Introduction
ChemAICluster clusters molecules supplied as SMILES strings. It groups molecules by similarity using either k-means or hierarchical clustering, and it visualizes the results as cluster plots and dendrograms.
How to Use
- Upload File: Prepare your dataset in CSV format. The CSV file should have at least two columns: an ID column and a SMILES string column.
- Select Clustering Algorithm: Choose between k-means and hierarchical clustering. You can also specify the number of clusters.
- Submit: Click the 'Submit' button to perform clustering. The app will process the data and return the cluster assignments, which can be downloaded as a CSV file.
- Visualization: After clustering, you can view a 2D PCA cluster plot and, if hierarchical clustering was selected, a dendrogram.
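Before uploading, it can help to check that the CSV has the two required columns. The following is a minimal sketch using only the Python standard library. The column names `ID` and `SMILES` and the helper `read_molecules` are assumptions for illustration; the header names the app actually expects may differ.

```python
import csv
import io

# Hypothetical column names; adjust them to match your own file.
ID_COLUMN = "ID"
SMILES_COLUMN = "SMILES"


def read_molecules(csv_text, id_column=ID_COLUMN, smiles_column=SMILES_COLUMN):
    """Return a list of (id, smiles) pairs, skipping rows with an empty SMILES."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in (id_column, smiles_column) if name not in header]
    if missing:
        raise ValueError(f"missing required column(s): {', '.join(missing)}")
    molecules = []
    for row in reader:
        smiles = (row[smiles_column] or "").strip()
        if smiles:
            molecules.append(((row[id_column] or "").strip(), smiles))
    return molecules


# Small in-memory example; the third row has no SMILES and is skipped.
sample = "ID,SMILES\nmol1,CCO\nmol2,c1ccccc1\nmol3,\n"
molecules = read_molecules(sample)
print(molecules)
```

This check only looks at the file layout. It does not verify that each SMILES string is chemically valid, which would require a cheminformatics toolkit.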
Features
- SMILES Data Upload: Upload molecular data in SMILES format for clustering.
- Clustering Options: Choose between k-means and hierarchical clustering.
- Visualization: View cluster plots using PCA for k-means or a dendrogram for hierarchical clustering.
- Download Options: Download clustered data and visualizations (plots and dendrograms).
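To give a feel for what the k-means option does, here is a small, self-contained sketch of the standard k-means procedure (Lloyd's algorithm) on numeric feature vectors. It is not the app's implementation. The feature values are made-up placeholders, the conversion of SMILES strings into numeric features is not shown, and the farthest-first initialization is an assumption chosen so the example is deterministic.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain k-means on equal-length numeric vectors.

    Returns one cluster label (0..k-1) per input point.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Deterministic farthest-first initialization (an assumption for this sketch):
    # start from the first point, then repeatedly add the point farthest
    # from the centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Placeholder 2D features for six hypothetical molecules in two obvious groups.
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels = kmeans(features, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The number of clusters `k` plays the same role as the cluster count you choose in the app: k-means always returns exactly that many groups, whether or not the data naturally splits that way.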
Output
Once the clustering is complete, you will be able to download the following:
- Clustered Data: CSV file with cluster assignments for each molecule.
- Visualizations: PCA cluster plot and dendrogram (if applicable).
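Once you have the clustered CSV, a few lines of Python are enough to list which molecules ended up in each cluster. This is a sketch under assumptions: the header names `ID`, `SMILES`, and `Cluster` and the helper `group_by_cluster` are hypothetical, so check the header of the file you actually downloaded.

```python
import csv
import io
from collections import defaultdict


def group_by_cluster(csv_text, id_column="ID", cluster_column="Cluster"):
    """Map each cluster label (as text) to the list of molecule IDs assigned to it."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[cluster_column]].append(row[id_column])
    return dict(groups)


# Stand-in for the contents of a downloaded file (hypothetical column names).
downloaded = "ID,SMILES,Cluster\nmol1,CCO,0\nmol2,CCN,0\nmol3,c1ccccc1,1\n"
groups = group_by_cluster(downloaded)
print(groups)
```

To read a real file instead of the in-memory string, pass the text returned by `open(path, newline="").read()`.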
References
- Sharma R. Cluster analysis to identify prominent patterns of anti-hypertensives: A three-tiered unsupervised learning approach. Informatics in Medicine Unlocked, 2020. https://doi.org/10.1016/j.imu.2020.100303
- Ikotun et al. K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data. Information Sciences, 2023.
- Altman N and Krzywinski M. Clustering. Nature Methods, 2017. https://doi.org/10.1038/nmeth.4299
Contact
If you have any questions or need support, feel free to contact us at [email protected].