⚠️ Research / Paper Only: NOT financial advice.
Two different systems on this page. Know which is which:
Multi-Factor Methodology = scores individual picks right now (AMZN scores 133 today). These are research signals, displayed below. They are NOT money-ready.
🎯 4 Promotion Gates = evaluates whether a full strategy (hundreds of trades across months) is proven for real capital. Currently 0/6 asset classes pass. Gates do NOT apply to individual picks on this page.
Pro-level overlay (2026-06-13): STRONG_BUY is demoted when the intrabar sym×dir forward WR is <50% (n≥5) or the class intrabar check FAILs at n≥100.
All picks are logged to ejaguiar1_stocks.picks_now_tracker for tracking.
Today's Market Context
🧠 How Each Pick Is Generated: Full Multi-Factor Methodology
THIS IS THE SCORING SYSTEM FOR PICKS BELOW
Every pick is scored on ALL 5 factors below simultaneously. No pick is published unless every factor has been evaluated:
🔵 Momentum (30% weight): 3-month / 1-month / 5-day returns. A stock up 15% in 3 months that pulls back 6% in 5 days = TREND+DIP (buy the dip in an uptrend). Data: yfinance daily OHLCV
🟢 Mean Reversion (20% weight): RSI (14-period) + Bollinger Band %B position. RSI <35 + price at lower band = statistically significant oversold bounce setup. Data: yfinance daily OHLCV
🟡 Analyst Consensus (25% weight): Wall Street recommendationMean + targetMeanPrice. STRONG_BUY with 25%+ upside = high-conviction fundamental overlay. Data: yfinance .info (40+ analysts per stock)
🔴 Vol-Adjusted Safety (15% weight): Realized vol (RVOL) and ATR sizing. Low vol (TLT 9%) = 8% position. High vol (NVDA 46%) = 2% position. Sizing: Kelly × Risk-Parity
⚪ DB Edge Overlay (10% weight): Our own resolved pick history from at_pick_outcomes (39,880 closed trades). Symbols with n≥5 closed trades and WR>55% get a bonus. Data: ejaguiar1_stocks
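As a rough illustration of how the five weights above could combine, here is a minimal sketch. Only the weights and the DB-edge thresholds (n≥5, WR>55%) come from the list. The 0-100 per-factor scale, the function names, and the example inputs are assumptions, and the sketch does not reproduce the bonus logic that lets a pick score above 100.

```python
# Weights are taken from the factor list above; everything else is illustrative.
WEIGHTS = {
    "momentum": 0.30,
    "mean_reversion": 0.20,
    "analyst_consensus": 0.25,
    "vol_adjusted_safety": 0.15,
    "db_edge_overlay": 0.10,
}


def composite_score(factor_scores: dict) -> float:
    """Weighted sum of per-factor scores (each ASSUMED to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(factor_scores)
    if missing:
        # Mirrors the rule that every factor must be evaluated.
        raise ValueError(f"missing factors: {sorted(missing)}")
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)


def db_edge_bonus_applies(n_closed: int, win_rate: float) -> bool:
    """DB Edge Overlay condition as stated: n >= 5 closed trades and WR > 55%."""
    return n_closed >= 5 and win_rate > 0.55


# Hypothetical factor scores for a single pick.
example = {
    "momentum": 80.0,
    "mean_reversion": 70.0,
    "analyst_consensus": 90.0,
    "vol_adjusted_safety": 40.0,
    "db_edge_overlay": 50.0,
}
print(composite_score(example))
```

A missing factor raises an error rather than silently scoring zero, which matches the "all factors evaluated" rule.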
🎯 The 4 Promotion Gates: ELI5
NOT applied to picks below. These are for strategy graduation
A strategy must pass ALL 4 gates before it can trade real money. Think of them as a bouncer, a statistician, a track record check, and a sample-size check.
Gate 1: DSR (Deflated Sharpe Ratio)
What it is: The Sharpe Ratio measures how good your returns are relative to risk (volatility). But the regular Sharpe can be gamed: if you test 100 strategies and pick the best one, it’ll look great just by luck. DSR deflates (punishes) the Sharpe based on how many strategies were tested.
ELI5: 1,000 monkeys flip coins. One gets heads 10 times in a row. Regular Sharpe says “genius!” DSR says “you tested 1,000 monkeys, so that is expected by chance.”
st_fear_greed_contrarian: DSR=19.70 ✅
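For readers who want to see the deflation step, here is a sketch of the published Bailey and López de Prado formulation, which returns a probability between 0 and 1. The page reports DSR=19.70, so its pipeline evidently uses a different scale; this block is not that implementation. All function and parameter names are illustrative, and `sharpe` is assumed to be per-period (not annualized).

```python
from math import e, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()
EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(n_trials: int, sharpe_std: float) -> float:
    """Sharpe expected from the luckiest of n_trials strategies with no edge.

    sharpe_std is the standard deviation of Sharpe ratios across the trials.
    """
    if n_trials < 2:
        return 0.0
    a = _STD_NORMAL.inv_cdf(1 - 1 / n_trials)
    b = _STD_NORMAL.inv_cdf(1 - 1 / (n_trials * e))
    return sharpe_std * ((1 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe(sharpe: float, n_obs: int, n_trials: int,
                    sharpe_std: float, skew: float = 0.0,
                    kurt: float = 3.0) -> float:
    """Probability that the observed Sharpe beats the luck-only benchmark."""
    benchmark = expected_max_sharpe(n_trials, sharpe_std)
    denom = sqrt(1 - skew * sharpe + (kurt - 1) / 4 * sharpe ** 2)
    return _STD_NORMAL.cdf((sharpe - benchmark) * sqrt(n_obs - 1) / denom)


# Same observed Sharpe, but more strategies tested means a lower DSR.
print(deflated_sharpe(0.10, n_obs=430, n_trials=1, sharpe_std=0.05))
print(deflated_sharpe(0.10, n_obs=430, n_trials=1000, sharpe_std=0.05))
```

The only input that changes between the two calls is the number of strategies tested, which is exactly the "1,000 monkeys" penalty.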
Gate 2: FDR p-value (False Discovery Rate)
What it is: The p-value is the probability of seeing results at least this good if the strategy had no real edge. Lower = better. p=0.05 means a no-edge strategy would look this good about 5% of the time. FDR adjusts for testing many strategies at once (if you test 56 no-edge strategies at p<0.05, ~3 will look significant by luck).
ELI5: You’re interviewing 56 job candidates. Some ace the interview by luck. FDR ranks everyone and only hires down the list as long as the expected share of flukes among the hires stays below 5%.
st_fear_greed_contrarian: p=0.0000 ✅
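The "rank everyone and hire down the list" idea matches the Benjamini-Hochberg step-up procedure. The page does not name its exact FDR method, so treat the sketch below as one standard way to do it; the function name and example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values: list, alpha: float = 0.05) -> list:
    """Return one pass/fail flag per p-value, controlling FDR at level alpha."""
    m = len(p_values)
    # Rank strategies from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the LARGEST rank whose p-value clears its bar.
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[idx] = True
    return passed


# Hypothetical p-values for 8 strategies.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_vals))
```

With these inputs only the first two strategies survive, even though five have p<0.05 on their own.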
Gate 3: Win Rate ≥ 40%
What it is: % of winning trades out of total closed forward (live) trades. Not backtest: real money or paper.
ELI5: You don’t need to be right more than half the time. A 40% WR with 2:1 reward-to-risk still makes money. The gate just filters out clearly broken strategies.
st_fear_greed_contrarian: WR=53% ✅
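The arithmetic behind "40% WR with 2:1 reward-to-risk still makes money" can be checked in a few lines. This is a simplified sketch that assumes every win is exactly `reward_to_risk` times the size of every loss and ignores costs; the function names are illustrative.

```python
def expectancy_in_r(win_rate: float, reward_to_risk: float) -> float:
    """Average result per trade, measured in units of risk (1R = one full loss)."""
    return win_rate * reward_to_risk - (1 - win_rate) * 1.0


def breakeven_win_rate(reward_to_risk: float) -> float:
    """Win rate at which expectancy is exactly zero."""
    return 1 / (1 + reward_to_risk)


# 40% winners at 2:1 gives 0.4 * 2 - 0.6 * 1 = +0.2R per trade.
print(expectancy_in_r(0.40, 2.0))
print(breakeven_win_rate(2.0))
```

At 2:1 the break-even win rate is one in three, so the 40% gate leaves only a modest margin.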
Gate 4: Trade Count ≥ 30
What it is: Requires enough real trades for the stats to be meaningful. With 5-10 trades, a 100% win rate means nothing: you may just have got lucky.
ELI5: Flipping a coin 3 times and getting 3 heads doesn’t prove it’s rigged. Flipping 300 times and getting 200 heads? Now you have evidence.
st_fear_greed_contrarian: n=430 ✅
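The coin-flip intuition can be made concrete with an exact binomial tail probability. This is a standalone illustration and not part of the page's pipeline; the function name is an assumption.

```python
from math import comb


def binom_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(at least k successes in n independent trials with success chance p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 3 heads out of 3 happens 1 time in 8 with a fair coin: weak evidence.
print(binom_tail(3, 3))
# 200+ heads out of 300 is vanishingly unlikely with a fair coin.
print(binom_tail(200, 300))
```

The same win rate becomes far more convincing as the trade count grows, which is why the gate sets a minimum n.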
Note: Profit Factor (PF) is tracked separately for tier classification (T1/T2) but is NOT one of the 4 gates. A strategy can be promoted with PF<1.5 if it passes the 4 gates. The tier reflects how strong the edge is.
🧩 How Methodology + Gates Work Together: These are two separate layers. The multi-factor methodology above scores each pick RIGHT NOW (e.g., “AMZN scores 133/100 today”). These are research signals, logged to picks_now_tracker for tracking, not money-ready. The 4 promotion gates evaluate whether a full strategy (like the quant_multifactor_screener itself, or sub-strategies like st_fear_greed_contrarian) has proven itself over hundreds of trades and is ready for real capital. Individual picks on this page do NOT need to pass the gates; the gates are for strategy-level graduation. Currently 0/6 asset classes have a strategy that passes all 4 gates, which is why the disclaimer says “Zero asset classes are currently Money-Ready” at the top of this page.
What does "Risk-Off -3% to -6%" mean?
It means the whole market is in fear mode today: stocks sold off 3-6% and crypto sold off 16-23%. This is the CONTEXT, not your personal risk. Your actual risk is set by the Stop-Loss (SL) on each pick, which is 4-7% below entry depending on the pick. Example: You put $1,000 in AMZN. If it hits the stop-loss at $234 (-4.9%), you lose $49. That's it. The market being down 3% does not mean you automatically lose 3%.
🎯 How TP and SL work (ELI5)
Take Profit (TP): The price where you exit and collect your gain. For AMZN: entry $246 → TP $280 = you make +$34/share (+13.8%).
Stop-Loss (SL): The price where you exit automatically to limit your loss. For AMZN: SL $234 = max loss -$12/share (-4.9%).
R:R Ratio: AMZN's is 2.8:1, so for every $1 you can lose, you can make $2.80. Professional traders want R:R ≥ 1.5.
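The AMZN numbers above can be reproduced with a small helper. This sketch covers long trades only and assumes the stop fills exactly at the SL price (no gaps or slippage); the function name and returned keys are illustrative.

```python
def long_trade_plan(entry: float, take_profit: float, stop_loss: float,
                    capital: float) -> dict:
    """Reward, risk, R:R and dollar loss at the stop for a LONG position."""
    if not stop_loss < entry < take_profit:
        raise ValueError("a long trade needs stop_loss < entry < take_profit")
    reward_per_share = take_profit - entry
    risk_per_share = entry - stop_loss
    loss_pct = risk_per_share / entry
    return {
        "reward_per_share": reward_per_share,
        "risk_per_share": risk_per_share,
        "gain_pct": reward_per_share / entry,
        "loss_pct": loss_pct,
        "reward_to_risk": reward_per_share / risk_per_share,
        "max_loss_dollars": capital * loss_pct,
    }


# AMZN example from the text: entry $246, TP $280, SL $234, $1,000 invested.
plan = long_trade_plan(entry=246, take_profit=280, stop_loss=234, capital=1000)
for key, value in plan.items():
    print(f"{key}: {value:.4f}")
```

Rounded, this gives the +13.8% / -4.9% / 2.8:1 / $49 figures quoted above.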
🎿 Safety Tiers: Like a Ski Hill
🟢 GREEN
TLT, IEF, SHY: US Government Bonds. You get paid ~4-5%/year just to hold them. When markets crash, these usually go UP as people flee to safety. Lowest daily price swings. Anyone can hold these.
🟡 YELLOW
AAPL, EUR/USD, GBP/USD: Moderate. Resilient stocks or small-move forex plays with tight stop-losses. Fine for most investors.
🔴 RED
AMZN, GOOGL, MU: Higher volatility. Strong fundamentals but 5-10% weekly swings. Size smaller: don't put in more than you're OK seeing drop 10% first.
⚫ BLACK
NVDA, AVGO, AMD: Can move 10% in one day. Highest upside and highest volatility. Expert/advanced investors only.
TOP PICKS BY ASSET CLASS
FORWARD-TESTED PERFORMANCE: every pick logged BEFORE its outcome was known · first-touch TP/SL resolver, 10d time-exit · deduped to 1 pick/symbol/day
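A first-touch resolver with a time exit can be sketched as below. This is an illustration under stated assumptions, not the page's actual resolver: it handles long picks only, treats "10d" as 10 daily bars, fills exactly at the TP or SL price, and lets SL win when both levels are touched in the same bar (the conservative tie-break this page mentions for its independent re-resolution). The `Bar` type and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Bar:
    high: float
    low: float
    close: float


def resolve_long(entry: float, tp: float, sl: float, bars: List[Bar],
                 max_bars: int = 10) -> Tuple[str, Optional[float]]:
    """Return (outcome, return) where outcome is SL, TP, TIME or OPEN."""
    window = bars[:max_bars]
    for bar in window:
        # Check the stop first so that SL wins a same-bar tie.
        if bar.low <= sl:
            return "SL", sl / entry - 1
        if bar.high >= tp:
            return "TP", tp / entry - 1
    if len(window) < max_bars:
        return "OPEN", None  # not enough bars yet, still unresolved
    # Neither level touched within the window: exit at the last close.
    return "TIME", window[-1].close / entry - 1


# Second bar touches both levels, so the conservative rule books a loss.
bars = [Bar(250, 240, 245), Bar(281, 233, 260)]
print(resolve_long(246, 280, 234, bars))
```

Daily bars cannot show which level was hit first inside the day, which is why a tie-break rule is needed at all.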
💹 LIVE MARK-TO-MARKET PNL: current unrealized PnL from entry to today's price · refreshed hourly
⚠️ Unrealised ≠ profit. These are OPEN positions marked at today's price. Most haven't had a chance to hit their stop yet, so this panel skews positive in a rally. The HONEST realised result is the FORWARD-TESTED panel: net-negative (-14.4% cumulative · 31.9% WR). An independent first-touch re-resolution in which SL wins ties confirms net PF 0.82 (no subset clears the bar). No proven forward edge. Paper/research only; do not size.
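Win rate and profit factor are both simple functions of the list of closed-trade returns. The sketch below uses the usual definitions (PF = gross wins divided by gross losses); the sample returns are invented for illustration and are not this page's data.

```python
def win_rate(returns: list) -> float:
    """Share of closed trades with a strictly positive return."""
    if not returns:
        raise ValueError("no closed trades")
    return sum(1 for r in returns if r > 0) / len(returns)


def profit_factor(returns: list) -> float:
    """Gross winnings divided by gross losses; below 1.0 means net-negative."""
    if not returns:
        raise ValueError("no closed trades")
    gross_win = sum(r for r in returns if r > 0)
    gross_loss = -sum(r for r in returns if r < 0)
    if gross_loss == 0:
        return float("inf")
    return gross_win / gross_loss


# Hypothetical closed trades: one +5% winner and three -2% losers.
closed = [0.05, -0.02, -0.02, -0.02]
print(win_rate(closed), profit_factor(closed))
```

A PF below 1.0, like the 0.82 quoted above, means losses outweighed gains over the resolved sample.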
🤖 Extended AI Model Panel: FYI only · does not change pick scores · covers top 4 EQUITY symbols
Scope: 7 models reviewed AMZN · NVDA · AAPL · META (the 4 largest EQUITY picks). BOND/ETF/FOREX/COMMODITY picks are not reviewed by AI; they use quant-only scoring (RSI, Bollinger, momentum, vol).
Each model received the same 7 raw signals so verdicts are comparable. Here's what each means:
RSI = Relative Strength Index (0-100): measures if a stock is oversold (<35, might bounce) or overbought (>70, might drop). Models use this to gauge short-term entry timing.
Piotroski F-Score (0-9): company financial health scorecard. 9/9 = excellent (profitable, growing, no bad debt). 4/9 = warning signs. Models weight this heavily because financially healthy companies tend to survive selloffs.
Altman-Z: bankruptcy risk score. Above 3.0 = safe zone (low bankruptcy risk). Below 1.8 = danger zone. NVDA at 51 is far inside the safe zone; META at 7.9 is still safe but less so.
Analyst count + target: how many Wall Street analysts cover the stock and what their average 12-month price target is. More analysts + higher target = more institutional conviction. AMZN's 94 analysts with $307 target gives confidence.
EPS growth: earnings per share growth year-over-year. The #1 driver of stock prices long-term. +215% (NVDA) means the business is exploding; models like this for growth thesis.
RVOL (Realized Volatility): how much the stock typically moves in a year. 9% (TLT) = calm; 46% (NVDA) = wild. Models use this for sizing: lower vol = safer to buy more.
3-month momentum: total return over the last 3 months. Positive with a recent dip (AMZN: +15% 3m, -9% 5d) = TREND+DIP = buy the pullback. Negative and still falling = downtrend = avoid.
No score anchoring: all model verdicts are independent from the quant score above.
| Model / Family | AMZN | NVDA | AAPL | META | Key Stats Cited | Framework Used |
| --- | --- | --- | --- | --- | --- | --- |
| Hybrid Ensemble (large), Local proxy · multi-model blend | ✅ APPROVE RSI 38, 94 analysts, EPS+75% | ⚠️ CAUTION Altman-Z 51.4 great; RVOL 46% too high | ✅ APPROVE Piotroski 9/9; most resilient; low vol | ❌ REJECT 3m=-8.3%; Piotroski 4/9; SL risk | Altman-Z, Piotroski, RVOL, 3m momentum, EPS | quality-resilience |
| Cloudflare LLaMA, Cloudflare Workers AI | ✅ APPROVE RSI 38, 94 analysts, EPS+75% | ⚠️ CAUTION 79 analysts bull; RVOL 46% flagged | ⚠️ CAUTION Piotroski 9/9 but only +5.4% upside | ❌ REJECT Piotroski 4/9; 3m trend weak | Piotroski, analyst count, 3m trend, upside % | quality screening |
| DeepSeek Chat (proxy), Local proxy · DeepSeek-direct | ✅ APPROVE RSI 38 + 94 analysts = safe entry | ⚠️ CAUTION EPS+215% great; RVOL 46% risky today | ✅ APPROVE Piotroski 9/9 + +0.3% today = safe haven | ❌ REJECT Piotroski 4/9; 3m=-8.3% | RSI, Altman-Z, Piotroski, RVOL, 3m trend | mean-reversion + quality |
| LLaMA 3.3 70B (Groq), Meta AI · Groq inference | ✅ APPROVE RSI 38; 94 analysts STRONG_BUY | ⚠️ CAUTION EPS+215% compelling; RVOL 46%, wait | ✅ APPROVE Piotroski 9/9; +0.3% defensive | ❌ REJECT Piotroski 4/9; 3m trend negative | RSI, analyst consensus, Piotroski, 3m trend | quality-defensive |
| LLaMA 4 Scout (Groq), Meta MoE next-gen · Groq | ⚠️ CAUTION Strong fundamentals; risk-off day risk | ❌ REJECT RVOL 46% in risk-off = catching a knife | ✅ APPROVE Piotroski 9/9; low vol; best risk-off pick | ❌ REJECT Piotroski 4/9; high ask in selloff | RVOL, market regime, Piotroski, SL downside | risk-parity / regime-aware |
| Qwen3-32B (Alibaba/Groq), Alibaba DAMO Academy | ✅ APPROVE Analyst sentiment + fundamentals align | ⚠️ CAUTION EPS+215% strong; sector volatile in selloff | ✅ APPROVE Piotroski + analyst support + low vol | ❌ REJECT Weak Piotroski + high valuation ask | Analyst consensus, Piotroski, RVOL, valuation | fundamental screening |
| DeepSeek Chat (API direct), DeepSeek · direct API key | ✅ APPROVE RSI 38 + 94 analysts = compelling entry | ⚠️ HOLD EPS+215% offset by 46% vol; RSI neutral | ✅ APPROVE Perfect quality + positive relative strength | ❌ REJECT Weak fundamentals + negative momentum | RSI, Altman-Z, Piotroski, 3m trend, analyst count | quality-momentum |
AI verdicts for remaining 17 picks (5 models)
| Sym | Class | Score | Verdict | Key Stat | Methodology | Model reasons |
| --- | --- | --- | --- | --- | --- | --- |
| AVGO | EQUITY | 128 | ✅ 5/5 | RSI 39.9 oversold | momentum+quality | 5 approve: oversold + analyst target +33%; risk-off |
| GBPUSD=X | FOREX | 92 | ✅ 5/5 | DB_n=114 WR=58.8% edge | DB-evidence | 5 approve: DB-corroborated WR=58.8% n=114, only screener pick with real proof |
| AMD | EQUITY | 100 | ⚠️ 2/5 | DB_n=29 DB_WR=41.4% | momentum+quality | 2 approve: best WR in semi universe; 3 reject: 130% 3m = extended |
| DIS | EQUITY | 96 | ⚠️ 2/5 | RSI 41 near oversold | quality | 2 approve: pullback in downtrend; 3 reject: -1.9% 3m = no trend |
| MA | EQUITY | 96 | ⚠️ 2/5 | -5% 3m = cheap entry | quality | 2 approve: defensive payments; 3 reject: -5% 3m downtrend, avoid |
| V | EQUITY | 100 | ⚠️ 2/5 | +0.9% 5d relatively flat | quality | 2 approve: stable; 3 reject: unexciting in selloff, no catalyst |
| GOOGL | EQUITY | 121 | ❌ 1/5 | DB_n=7 WR=28.6% | momentum+quality | 4 reject: 20% 3m gain = extended; analyst target already baked in |
| IWM | ETF | 96 | ❌ 1/5 | DB_n=0 no history | momentum | 4 reject: small-cap ETF gets hit hardest in selloffs |
| LRCX | EQUITY | 98 | ❌ 0/5 | DB_n=0 no history | momentum | 0/5: 43% 3m gain extended, semi equipment in selloff = avoid |
| MSFT | EQUITY | 90 | ❌ 1/5 | -7.5% 5d sharp drop | quality | 4 reject: -7.5% 5d in selloff, no DB proof; 1 approve: buy dip |
| MU | EQUITY | 100 | ❌ 0/5 | DB_n=0 no history | momentum | 0/5: 122% 3m gain massively extended; no DB proof; semi selloff |
| QQQ | ETF | 96 | ❌ 1/5 | RSI 48 neutral | momentum | 4 reject: tech ETF gets hammered in risk-off; RSI neutral |
| TXN | EQUITY | 95 | ❌ 0/5 | 46% 3m extended | momentum | 0/5: 46% 3m run = extended; semi sector; 0 DB proof |
| UBER | EQUITY | 91 | ❌ 0/5 | -4.2% 3m = downtrend | quality | 0/5: -4.2% downtrend, unprofitable in risk-off, avoid |
| UNH | EQUITY | 94 | ❌ 0/5 | RSI 68 near overbought | momentum | 0/5: RSI 68 near overbought; 40% 3m gain extended; no proof |
| VGT | ETF | 96 | ❌ 1/5 | 26% 3m extended | momentum | 4 reject: tech ETF + extended 3m + no DB proof |
| AMAT | EQUITY | 96 | ❌ 0/5 | 33% 3m extended | momentum | 0/5: semi equipment, extended, bad entry on risk-off |
5 models: Hybrid Ensemble · Cloudflare LLaMA · DeepSeek Chat · LLaMA 3.3 (Groq) · Qwen (Groq). Only AVGO and GBPUSD=X had unanimous APPROVE across all models in this batch. Evaluated: June 6, 2026.
Multi-Model AI Review Panel (informational: does not change scores)
Each pick was evaluated by 7 independent AI models. Verdicts are informational; pick scores are unchanged. Evaluated: June 6, 2026 00:37-01:15 EST.
AMZN · $246.03 LONG
✅ 6/7 APPROVE · 1/7 CAUTION
Why strong buy: 3-month uptrend + 5-day flush = classic TREND+DIP. RSI 38 oversold. 62+ analysts STRONG_BUY. Piotroski 6/9, Altman-Z 5.1 safe. Average target $307 = 25% above today.
✅ Hybrid Ensemble: quality-momentum - "strong fundamental growth, oversold RSI suggests bounce"
✅ Cloudflare LLaMA: momentum+quality - "analyst consensus + EPS growth support bounce"
✅ DeepSeek Chat (proxy): mean-reversion+quality - "RSI 38 oversold + 94 analysts STRONG_BUY"
✅ LLaMA 3.3 70B: quality-momentum - "oversold + analyst consensus in risk-off"
✅ DeepSeek Chat (direct): quality-momentum - "key stats: RSI 38 oversold, 94 analysts STRONG_BUY"
⚠️ paid-mode-large (proxy): CAUTION/HOLD - "strong fundamentals but risk-off day makes entry risky; wait for clearer direction"
✅ Others (2 models): APPROVED, cited analyst consensus + oversold RSI
NVDA · $205.10 LONG
⚠️ DISPUTED · 5/7 CAUTION/HOLD · 1 REJECT · 1 APPROVE
Why split: EPS +215% YoY + Piotroski 7/9 + Altman-Z 51.4 (exceptional) vs 46% RVOL (highly volatile). In a risk-off market, high-vol names get sold first. The fundamentals are great but the entry timing is debated.
✅ LLaMA 3.3 70B: APPROVE - "EPS+215% + exceptional Altman-Z outweigh vol concern"
⚠️ Hybrid Ensemble: CAUTION - "exceptional financial health and growth, but high RVOL 46% warrants caution"
⚠️ Cloudflare LLaMA: CAUTION - "high analyst consensus + exceptional Altman-Z but high vol is a concern"
⚠️ DeepSeek Chat (proxy): CAUTION - "EPS+215% + Altman-Z 51.4 compelling but RVOL 46% on risk-off day"
⚠️ DeepSeek Chat (direct): HOLD - "exceptional earnings offset by elevated volatility"
❌ paid-mode-large (proxy): HOLD - "strong fundamentals overshadowed by risk-off sentiment; SL at $190 vulnerable"
⚠️ 1 additional model: CAUTION, volatility too high for current regime
AAPL · $307.34 LONG
✅ 6/7 APPROVE · 1/7 CAUTION (safety over upside)
Why near-unanimous: Perfect 9/9 Piotroski. Most resilient stock today (+0.3% vs market -3%). Lowest equity RVOL (18.5%). All models said: if you have to buy something today, AAPL is the one; it's not going to crash.
✅ Hybrid Ensemble: quality-resilience - "perfect Piotroski 9/9 + +0.3% today = ideal defensive play"
✅ DeepSeek Chat (proxy): risk-parity - "lowest vol + perfect quality score = safest haven in selloff"
✅ LLaMA 3.3 70B: quality-defensive - "perfect Piotroski + resilience in down market = safe-haven"
✅ DeepSeek Chat (direct): quality-momentum - "perfect quality score + positive relative strength in down market"
⚠️ Cloudflare LLaMA: CAUTION - "perfect score supports bounce, but +0.3% performance not impressive as a BUY signal"
✅ 2 additional models: APPROVED, cited 9/9 Piotroski + lowest vol
META · $593.00 LONG
❌ 7/7 REJECT · unanimous rejection
Why ALL 7 models rejected: Piotroski 4/9 is the weakest financial health score in the entire pick universe. The 3-month trend is DOWN (-8.3%). This is NOT a dip in an uptrend, it's an actual downtrend. Paying $593 with shaky fundamentals into a market selloff was deemed the worst risk/reward of all 4 picks.
❌ Hybrid Ensemble: quality-momentum - "negative 3m trend + 4/9 Piotroski + SL -5.6% = unsuitable"
❌ Cloudflare LLaMA: quality - "weak Piotroski + poor recent performance outweigh strong analyst consensus"
❌ DeepSeek Chat (proxy): quality-momentum - "weak fundamentals + negative momentum outweigh high target"
❌ LLaMA 3.3 70B: trend-valuation - "weak Piotroski + negative trend = unattractive on risk-off day"
❌ DeepSeek Chat (direct): quality-momentum - "weak basics + negative trend unsuitable for risk-off"
❌ paid-mode-large (proxy): "4/9 Piotroski + 3/3 AI rejections = asymmetric downside to $560"
❌ 1 additional model: REJECTED, same reasoning: Piotroski 4/9 + 3m negative momentum
Methodology: Each model was given the same 4 equity picks with full fundamental data (price, TP, SL, RSI, momentum, Piotroski, Altman-Z, analyst consensus, EPS growth, volatility).
Models rated independently via local proxy (Hybrid Ensemble, Cloudflare LLaMA, DeepSeek Chat (proxy), paid-mode-large, LLaMA 4 Scout, Qwen3-32B) and direct APIs (LLaMA 3.3 70B via Groq, DeepSeek Chat via deepseek.com).
The key debate was timing vs fundamentals: all models agreed the long-term case is strong for AMZN/NVDA/AAPL, but split on whether this risk-off day is the right entry.
💰 Money-Ready Criteria: how close each asset class is to full qualification
A class must pass ALL 6 gates to be money-ready. Data from money_ready_verdict.json (live DB: ejaguiar1_stocks.at_raw_picks).
0/8 pass. EQUITY needs 29 more resolved picks; CRYPTO needs WR + PF improvement.
| Class | n (≥100) | WR (≥50%) | PF (≥1.5) | DSR | MDD | Recency | Gates Passed | Verdict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EQUITY (closest to T2) | 71 ❌ (-29) | 54% ✅ | 1.84 ✅ | ✅ | - | ✅ | 3/6 | INSUFFICIENT_DATA |
| CRYPTO (n OK but weak WR/PF) | 171 ✅ | 48% ❌ (-2pp) | 0.95 ❌ | - | - | ✅ | 1/6 | NOT_READY |
| COMMODITY (new candidates emerging) | 15 ❌ (-85) | 40% ❌ | 1.10 ❌ | ✅ | ✅ | ✅ | 3/6 | INSUFFICIENT_DATA |
| ETF | 18 ❌ (-82) | 33% ❌ | 0.71 ❌ | - | - | ✅ | 1/6 | INSUFFICIENT_DATA |
| FOREX | 22 ❌ (-78) | 18% ❌ | 0.04 ❌ | - | - | ✅ | 1/6 | INSUFFICIENT_DATA |
| FUTURES | 15 ❌ (-85) | 13% ❌ | 0.41 ❌ | - | ✅ | ✅ | 2/6 | INSUFFICIENT_DATA |
| BOND (no data) | 0 ❌ | - | - | - | - | - | 0/6 | INSUFFICIENT_DATA |
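The six-gate check can be expressed as a small function. The n, WR, and PF thresholds come from the column headers; DSR, MDD, and recency are passed in as booleans because the page does not spell out all of their thresholds. The verdict mapping (INSUFFICIENT_DATA when n<100, otherwise NOT_READY) is inferred from the rows on this page, and the MONEY_READY label and all names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GateInputs:
    n: int                          # resolved picks
    win_rate: Optional[float]       # e.g. 0.54 for 54%, None if unknown
    profit_factor: Optional[float]  # None if unknown
    dsr_pass: bool
    mdd_pass: bool
    recency_pass: bool


def gate_results(s: GateInputs) -> dict:
    """One boolean per gate; an unknown metric counts as not passed."""
    return {
        "n": s.n >= 100,
        "wr": s.win_rate is not None and s.win_rate >= 0.50,
        "pf": s.profit_factor is not None and s.profit_factor >= 1.5,
        "dsr": s.dsr_pass,
        "mdd": s.mdd_pass,
        "recency": s.recency_pass,
    }


def verdict(s: GateInputs) -> str:
    results = gate_results(s)
    if all(results.values()):
        return "MONEY_READY"  # hypothetical label, no class reaches it today
    return "NOT_READY" if results["n"] else "INSUFFICIENT_DATA"


# stocks_rsi2_pullback figures quoted on this page: only recency fails.
rsi2 = GateInputs(n=894, win_rate=0.588, profit_factor=2.68,
                  dsr_pass=True, mdd_pass=True, recency_pass=False)
print(sum(gate_results(rsi2).values()), verdict(rsi2))
```

Treating unknown metrics as failures keeps the check conservative: a class cannot become money-ready on missing data.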
🎯 Closest Edge Candidates: detailed breakdown
How each candidate scores on the 6 money-readiness gates:
| Candidate | n (≥100) | WR (≥50%) | PF (≥1.5) | DSR | MDD≤20% | Recency | Gap to money-ready |
| --- | --- | --- | --- | --- | --- | --- | --- |
| stocks_rsi2_pullback LONG, EQUITY · strategy (multi-symbol) | 894 ✅ | 58.8% ✅ | 2.68 ✅ | ✅ | ✅ | ❌ | RECENCY FAIL: 0 new picks in 8 days (last emission May 29). 5/6 gates passed; only recency fails. ⚠️ DARK: emission pipeline broken, 14d WR collapsed to 29.9%. |
| GBPUSD=X LONG, FOREX · AlphaEngine (single-symbol) | 114 ✅ | 58.8% ✅ | - | - | ✅ | ✅ | PF/DSR UNKNOWN: Meets n≥100 + WR≥50% + recency. PF estimate needs live PnL tracking. Current: entry 1.3336, TP 1.3536, SL 1.3203. ⚠️ Single-source (AlphaEngine only). |
| RENDERUSDT inverse_ml SHORT, CRYPTO · actively emitting | 15 ❌ (need 85 more) | 80% ✅ | 7.7 ✅ | - | - | ✅ | SMALL-n SAMPLE: WR/PF excellent but n=15 only. Actively emitting (last: Jun 06 03:51 UTC). At current rate, needs ~3-4 weeks to reach n=100. Best T2 candidate for CRYPTO. 🔵 WATCH. |
AI Tournament T1: What the best models do differently
Forward-tested picks, n=43-89 resolved per model. Ranked by combined WR + PF score.
cursor_agent Rank #2 · 66.1% WR
PF 2.35 · n=59 resolved
Highest WR of all models, balanced across classes.
• Avg win about +2.3%, avg loss about -1.0% = positive expectancy across the board
• Best: IWM LONG (100% n=5), BTCUSDT SHORT (100% n=2), META SHORT (100% n=2)
• Worst: ETHUSDT LONG (0% n=2), GC=F LONG (0% n=2), IEF LONG (0% n=2)
• Only model with no single-symbol WR disaster (all above 20%)
deepseek_v4 Rank #1 · PF 3.72
55.8% WR · n=43 resolved
Highest PF: a few big wins offset losses.
• Avg win about +4.8% (about 2x bigger than other models' avg wins)
• Best: QQQ LONG (100% n=2), OJ=F LONG (50% n=2), USDCAD SHORT (50% n=2)
• Worst: NEARUSDT SHORT (0% n=2), BND SHORT (0% n=2), IEF LONG (33% n=3)
• Risk: lower n (43) = wider confidence interval; PF may regress with more emissions
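To put a number on "lower n = wider confidence interval", a Wilson score interval for the win rate is one common choice. The page does not say which interval it has in mind, so this is only an illustration. The win counts below are back-calculated from the quoted WR and n (24 of 43 for deepseek_v4, 52 of 89 for grok3) and are therefore approximations.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(wins: int, n: int, confidence: float = 0.95) -> tuple:
    """Wilson score interval for a binomial proportion (here, a win rate)."""
    if n <= 0 or not 0 <= wins <= n:
        raise ValueError("need n > 0 and 0 <= wins <= n")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = wins / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


for name, wins, n in [("deepseek_v4", 24, 43), ("grok3", 52, 89)]:
    low, high = wilson_interval(wins, n)
    print(f"{name}: WR {wins / n:.1%}, 95% interval {low:.1%} to {high:.1%}")
```

At n=43 the interval still spans a coin-flip win rate of 50%, which supports the caution about regression as more picks resolve.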
llama4_scout Rank #3
61.4% WR · PF 2.26 · n=57
Solid but unremarkable, with a similar profile to cursor_agent.
• Best: SPY LONG (100% n=3), APTUSDT LONG (100% n=2), NZDUSD SHORT (100% n=2)
• Worst: MSFT LONG (0% n=3), BTCUSDT LONG (0% n=3), QQQ SHORT (0% n=2)
• Slightly lower WR on top picks vs cursor_agent
grok3 Rank #4
58.4% WR · PF 2.02 · n=89
Most picks, lowest PF: spread too thin.
• Makes about 2x more picks than other models (89 resolved, 113 total)
• Best: GBPUSD LONG (100% n=2), EURUSD SHORT (100% n=2), BTCUSDT SHORT (66.7% n=3)
• Worst: ETHUSDT LONG (0% n=4), SPY SHORT (0% n=4), BND LONG (0% n=2)
• Many low-conviction bets drag down PF; 47% still OPEN
Takeaway: cursor_agent posts the highest WR with a balanced WR/PF profile on n=59. deepseek_v4 gets the highest PF by going for larger winners (avg win 4.8%) on fewer bets. grok3 dilutes its edge by casting a wider net (113 total picks).
Source: ejaguiar1_stocks.tournament_picks. Full report: ai-tournament.html