GEV Mixture Model (as opposed to GMM)
Hi,
I'm looking for advice on using a GEV mixture model for clustering instead of a Gaussian Mixture Model (GMM). I want to compare the results of the two. Is this feasible?
The data set I will be using is approx. 1TB of time series data.
Answers (1)
Ayush Kashyap
on 19 Jun 2023
If you have a good understanding of the data's underlying distribution, you can use a Generalized Extreme Value (GEV) mixture model for clustering. The GEV distribution is a continuous probability distribution that is frequently used to model extreme events, such as storm intensities or peak flood levels. As with any mixture model, a GEV mixture assumes that your data come from several subpopulations, each with its own parameters, and estimates those parameters from the data.
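To make the model concrete, here is one way to write the density of a univariate GEV mixture. The symbols are notation chosen for this sketch, not something from the question: K is the number of components, \pi_k are the mixing weights, and \mu, \sigma, \xi are the location, scale, and shape of a component.

```latex
f(x) = \sum_{k=1}^{K} \pi_k \, g(x;\, \mu_k, \sigma_k, \xi_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,

g(x;\, \mu, \sigma, \xi) = \frac{1}{\sigma}\, t(x)^{\xi + 1}\, e^{-t(x)},
\qquad
t(x) =
\begin{cases}
\left[ 1 + \xi \, \dfrac{x - \mu}{\sigma} \right]^{-1/\xi},
  & \xi \neq 0,\ 1 + \xi \, \dfrac{x - \mu}{\sigma} > 0, \\[2ex]
\exp\!\left( -\dfrac{x - \mu}{\sigma} \right),
  & \xi = 0.
\end{cases}
```

Clustering then amounts to assigning each observation to the component with the highest posterior probability, exactly as with a Gaussian mixture.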
GEV mixture models can capture distribution shapes that Gaussian components handle poorly, particularly skewed or heavy-tailed behavior. As a result, if your data contain extreme values or long tails, you might find that a GEV mixture model fits the data better than a Gaussian mixture model.
However, you should be aware that working with GEV mixture models can be more time-consuming than working with Gaussian mixture models, particularly with large datasets like the one you describe. Each Gaussian component is described by a mean and a covariance, which can be updated in closed form during fitting. Each GEV component instead has location, scale, and shape parameters that generally have to be estimated by iterative numerical optimization, which is more complicated and takes longer to compute. To process your data quickly, you may need computing resources such as GPUs or clusters.
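The extra cost can be seen in a minimal sketch of an EM-style fit for a univariate GEV mixture, written here in Python with NumPy and SciPy. This is an illustration under assumptions, not a tuned implementation: the function names (`gev_logpdf`, `e_step`, `m_step_component`, `fit_gev_mixture`), the quantile-based initialization, and the choice of Nelder-Mead are all made up for this example. The E-step is the same as for a Gaussian mixture, but each component's M-step needs its own numerical optimization.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme


def gev_logpdf(x, mu, sigma, xi):
    # SciPy's genextreme uses shape c = -xi relative to the usual GEV convention.
    return genextreme.logpdf(x, -xi, loc=mu, scale=sigma)


def e_step(x, weights, params):
    # log(pi_k * g(x_i; theta_k)) for every point and component, shape (n, K).
    log_p = np.column_stack(
        [np.log(w) + gev_logpdf(x, *p) for w, p in zip(weights, params)]
    )
    log_norm = np.logaddexp.reduce(log_p, axis=1)
    resp = np.exp(log_p - log_norm[:, None])  # posterior responsibilities
    return resp, log_norm.sum()               # and the total log-likelihood


def m_step_component(x, r, p0):
    # No closed-form update exists, so maximize the responsibility-weighted
    # log-likelihood numerically. sigma is optimized on the log scale.
    mask = r > 0

    def neg_weighted_ll(theta):
        mu, log_sigma, xi = theta
        lp = gev_logpdf(x[mask], mu, np.exp(log_sigma), xi)
        total = np.sum(r[mask] * lp)
        # Points outside the GEV support give -inf; penalize such parameters.
        return -total if np.isfinite(total) else 1e300

    start = [p0[0], np.log(p0[1]), p0[2]]
    res = minimize(neg_weighted_ll, start, method="Nelder-Mead")
    mu, log_sigma, xi = res.x
    return (mu, np.exp(log_sigma), xi)


def fit_gev_mixture(x, k=2, n_iter=20):
    x = np.asarray(x, dtype=float)
    # Assumed initialization: quantile blocks for the locations, a common
    # scale, and xi = 0 (Gumbel), whose support is the whole real line.
    chunks = np.array_split(np.sort(x), k)
    params = [(c.mean(), x.std(), 0.0) for c in chunks]
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        resp, _ = e_step(x, weights, params)
        weights = np.clip(resp.mean(axis=0), 1e-12, None)
        weights = weights / weights.sum()
        params = [m_step_component(x, resp[:, j], params[j]) for j in range(k)]
    resp, loglik = e_step(x, weights, params)
    return weights, params, resp, loglik
```

Calling `fit_gev_mixture(x, k=2)` returns the mixing weights, the per-component `(mu, sigma, xi)` tuples, the responsibilities, and the log-likelihood; `resp.argmax(axis=1)` gives hard cluster labels. Every EM iteration runs one optimizer per component over all the data, which is why a dataset of this size would likely need subsampling or parallel hardware.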
In conclusion, if your data have extreme values or heavy tails, a GEV mixture model can be used for clustering. However, these models are more computationally demanding than Gaussian mixture models and may require specialized computing resources. By comparing the results of the two models, you can determine which one fits and represents your data better, which will also help you understand the underlying distribution of your data.
2 Comments
Ayush Kashyap
on 22 Jun 2023
You may refer to the following documentation for a better understanding of clustering and the GEV distribution in particular: