# stock-clusters > Analyze stock tickers by return and volatility using K-means clustering. Use when exploring investment opportunities, identifying risk profiles, comparing market segments, finding high-performers, or analyzing portfolio positioning. - Author: Francisco Perez-Sorrosal - Repository: francisco-perez-sorrosal/stockpile - Version: 20260128234524 - Stars: 0 - Forks: 0 - Last Updated: 2026-02-06 - Source: https://github.com/francisco-perez-sorrosal/stockpile - Web: https://mule.run/skillshub/@@francisco-perez-sorrosal/stockpile~stock-clusters:20260128234524 --- --- name: stock-clusters description: Analyze stock tickers by return and volatility using K-means clustering. Use when exploring investment opportunities, identifying risk profiles, comparing market segments, finding high-performers, or analyzing portfolio positioning. --- Cluster stocks by annualized return and volatility using K-means to identify high-performers, stable assets, and outliers worth investigating. ## Workflow Overview 1. **Get tickers** - Use MCP `lookup` for tickers/indexes, or user provides them 2. **Ensure metrics** - Use MCP `refresh_metrics` to calculate return/volatility 3. **Generate elbow curve** - Run script with `--elbow` to determine optimal k 4. **Analyze elbow image** - View the image to identify the elbow point 5. **Run clustering** - Execute with chosen k value 6. **Present visualization** - Show the clusters.html to user 7. **Interpret results** - Explain cluster characteristics ## Step-by-Step Guide ### Step 1: Get Ticker Symbols Use the **ticker-cache MCP server** to get tickers: ``` # Index tickers lookup("sp500") lookup("nasdaq100") lookup("dow") # Specific tickers lookup("AAPL,MSFT,GOOGL,NVDA,META") # Company names (auto-resolves) lookup("Apple,Microsoft,Google") ``` ### Step 2: Calculate Metrics via MCP Ensure return/volatility metrics are calculated: ``` # For specific tickers refresh_metrics("AAPL,MSFT,GOOGL,NVDA,META") # For an index refresh_metrics("nasdaq100") ``` This fetches 1-year price history and calculates annualized metrics. ### Step 3: Generate Elbow Curve Run the clustering script with `--elbow` to find optimal cluster count: ```bash python scripts/stock_clusters.py --tickers "AAPL,MSFT,GOOGL,NVDA,META" --elbow --elbow-output elbow.png ``` For index analysis: ```bash python scripts/stock_clusters.py --index nasdaq100 --elbow --elbow-output elbow.png ``` ### Step 4: Analyze the Elbow Curve **You MUST view the elbow curve image** (`elbow.png`) to determine optimal k. Look for: - The **"elbow" point** where the curve bends sharply - This is where adding more clusters yields diminishing returns - The k value at the elbow is the optimal cluster count **Fallback guidelines** (if elbow is unclear): - Small sets (< 10): k=2 or k=3 - Medium sets (10-50): k=3 to k=5 - Large sets (50+): k=4 to k=7 ### Step 5: Run Clustering with Chosen k ```bash python scripts/stock_clusters.py --tickers "AAPL,MSFT,GOOGL,NVDA,META" --clusters --output clusters.html ``` Replace `` with the elbow value (e.g., `--clusters 3`). Cluster labels are assigned automatically: - **Strong Return, Low Vol**: Best risk-adjusted performers - **Strong Return, High Vol**: High risk/reward opportunities - **Moderate Return, Low Vol**: Stable performers - **Low Return, Low Vol**: Defensive stocks - **Negative Return**: Underperformers ### Step 6: Present the Visualization **Show `clusters.html` to the user** before interpreting. This allows them to: - Explore the interactive scatter plot - Hover over points for stock details - Form observations before your analysis ### Step 7: Interpret Results **Scatter plot axes:** - X-axis: Annualized return (higher = better performance) - Y-axis: Annualized volatility (higher = more risk) **Key patterns:** - Upper-left: High return, low volatility (ideal) - Upper-right: High return, high volatility (aggressive) - Lower-left: Low return, low volatility (defensive) - Outliers: Special situations worth investigating **Summarize:** 1. Which cluster has best risk-adjusted performers 2. Notable stocks in each cluster 3. Surprising outliers or patterns ## Script Commands | Task | Command | |------|---------| | Analyze custom tickers | `python scripts/stock_clusters.py -t "AAPL,MSFT,GOOGL"` | | Analyze with elbow | `python scripts/stock_clusters.py -t "..." --elbow --elbow-output elbow.png` | | Analyze NASDAQ-100 | `python scripts/stock_clusters.py -i nasdaq100` | | Analyze S&P 500 | `python scripts/stock_clusters.py -i sp500` | | Analyze Dow Jones | `python scripts/stock_clusters.py -i dow` | | Set cluster count | `python scripts/stock_clusters.py -t "..." -k 4` | | Save interactive chart | `python scripts/stock_clusters.py -t "..." -o clusters.html` | | Export to CSV | `python scripts/stock_clusters.py -t "..." --csv results.csv` | ## Output Formats ### Console ``` === Cluster Summary === Strong Return, Low Vol (2 stocks): Avg Return=45.6%, Avg Volatility=31.8% Strong Return, High Vol (1 stocks): Avg Return=56.8%, Avg Volatility=45.4% === Top Performers by Return === GOOGL (Alphabet Inc.): Return=59.4%, Volatility=32.1% [Strong Return, Low Vol] ``` ### CSV Export ```csv Ticker,Returns,Volatility,Cluster,Name,Sector,ClusterLabel AAPL,0.127,0.320,1,Apple Inc.,Consumer Electronics,"Strong Return, Low Vol" ``` ### Interactive HTML Plotly scatter plot with hover tooltips showing ticker, company, sector, and metrics. ## Data Flow ``` MCP Server (ticker-cache) Skill Script │ │ │ refresh_metrics("sp500") │ ├────────────────────────────────>│ │ │ │ (writes to cache file) │ │ │ │ │ reads cache │ │ clusters with scipy │ │ generates plotly viz │ │ │ ├──> elbow.png │ ├──> clusters.html │ └──> console output ``` ## Dependencies **MCP Server** (required for data): ```bash cd .claude/mcps/ticker-cache && make install ``` **Python packages** (pre-installed in Claude environments): - pandas, numpy, scipy, matplotlib **Auto-installed** (for interactive charts): - plotly ## See Also - `/ticker` - Look up ticker symbols and company details