Alternative Data for Investing: Satellite, Web Scraping & Unconventional Sources
Guide to alternative data—satellite imagery, web traffic, credit card data, and other unconventional sources for investment insights.
What Is Alternative Data?
Alternative data = Non-traditional data used for investment insights.
Traditional data: Financial statements, economic releases, prices
Alternative data: Everything else that signals economic activity
Market size: ~$7B and growing rapidly.
Categories of Alternative Data
Satellite & Geolocation
Satellite Imagery:
- Parking lot car counts (retail traffic)
- Oil storage tank shadows (inventory)
- Crop health (agriculture)
- Construction activity
- Shipping/port activity
Providers: Orbital Insight, Planet, Descartes Labs, SpaceKnow
Geolocation Data:
- Foot traffic (SafeGraph, Placer.ai)
- Store visits
- Trade area analysis
- Competitor monitoring
Web & App Data
Web Traffic:
- Site visits (SimilarWeb)
- Search trends (Google Trends)
- App downloads/usage (Sensor Tower, App Annie, since rebranded as data.ai)
- E-commerce activity
Web Scraping:
- Pricing data
- Job postings
- Product availability
- Review sentiment
Transaction Data
Credit/Debit Card Data:
- Consumer spending (anonymized, aggregated)
- Sector trends
- Geographic patterns
Providers: Bloomberg Second Measure, Earnest Research, Affinity Solutions
Point-of-Sale Data:
- Retailer sales
- Item-level data
- Real-time indicators
Social & Sentiment
Social Media:
- Twitter/X sentiment
- Reddit discussions
- StockTwits
- Influencer tracking
News Sentiment:
- News volume
- Tone analysis
- Topic extraction
Providers: Dataminr, Sprinklr, RavenPack
Expert & Survey
Expert Networks:
- GLG, AlphaSights, Third Bridge
- Industry expert calls
- Primary research
Proprietary Surveys:
- Consumer surveys
- Business surveys
- Custom panels
Other Alternative Data
Patent Filings: Innovation tracking
Regulatory Filings: SEC, FDA, EPA
Government Data: Permits, licenses
Weather: Agricultural, energy impact
Shipping/Logistics: Container data, AIS
How Alternative Data Is Used
Company-Level Signals
Revenue Nowcasting:
- Credit card data → quarterly sales estimate
- Web traffic → customer trends
- App usage → engagement metrics
Example: Track Target foot traffic to estimate same-store sales.
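A minimal sketch of the nowcasting idea above, assuming you already have a few quarters of a panel-based spend index alongside the revenue the company later reported. The figures and the names fit_ols and nowcast are hypothetical, and a one-variable least-squares fit stands in for whatever model a real desk would use.

```python
from statistics import mean

def fit_ols(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation; cannot fit a slope")
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return a, b

def nowcast(x_hist, y_hist, x_now):
    """Fit on past quarters, then map the current alt-data reading to revenue."""
    a, b = fit_ols(x_hist, y_hist)
    return a + b * x_now

# Hypothetical inputs: card-spend index per quarter and reported revenue ($M).
panel_spend = [100.0, 110.0, 120.0, 130.0]
reported_rev = [1000.0, 1100.0, 1200.0, 1300.0]

# Current quarter's spend index is known before the company reports.
estimate = nowcast(panel_spend, reported_rev, 140.0)
print(estimate)  # 1400.0 with these made-up, perfectly linear inputs
```

Real panels are noisy and cover only part of a company's customers, so the fit is only as good as the panel's representativeness.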
Sector/Industry Signals
Consumer Health:
- Aggregate spending patterns
- Restaurant visits
- Travel activity
Example: Use airline bookings data to gauge demand across the travel sector.
Macro Signals
Real-Time Economic Activity:
- Aggregate credit card spend → consumption
- Job postings → labor demand
- Shipping data → trade activity
Example: Aggregate satellite parking lot counts across many retailers to estimate overall retail sales.
Evaluating Alternative Data
Key Questions
- Coverage: Does it represent the target adequately?
- History: Enough backtest data?
- Frequency: How timely?
- Accuracy: Validated against ground truth?
- Alpha decay: How long before the signal is commoditized?
- Compliance: Legal and ethical sourcing?
Common Pitfalls
Survivorship bias: Sample includes only companies that still exist today
Backtesting issues: History may include data that was not actually available in real time (look-ahead bias)
Overfitting: Too many variables, too little history
Correlation ≠ Causation: Spurious relationships
Regime changes: Shocks such as the pandemic can break historical relationships
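One way to guard against the backtesting pitfall is to store, for every observation, the date it was actually delivered and to filter on that date rather than on the period it describes. The sketch below assumes a hypothetical record layout with period_end and available fields; the values are made up.

```python
from datetime import date

def point_in_time(records, as_of):
    """Keep only records that had actually been delivered by `as_of`."""
    return [r for r in records if r["available"] <= as_of]

# Hypothetical monthly readings; each arrives roughly a week after month end.
records = [
    {"period_end": date(2023, 3, 31), "available": date(2023, 4, 7), "value": 101.0},
    {"period_end": date(2023, 4, 30), "available": date(2023, 5, 9), "value": 104.0},
    {"period_end": date(2023, 5, 31), "available": date(2023, 6, 8), "value": 99.0},
]

# On June 1 the May reading exists in the archive but had not been delivered,
# so a backtest trading on that date must not see it.
visible = point_in_time(records, date(2023, 6, 1))
print([r["value"] for r in visible])
```

Filtering on period_end instead would quietly leak the May figure into a June 1 decision.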
Data Quality Considerations
Data Sourcing Issues
- Panel representativeness
- Geographic coverage
- Demographic coverage
- Opt-in bias
Processing Challenges
- Noise vs signal
- Seasonality adjustment
- Normalization
- Missing data
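Two of the processing steps above can be sketched in a few lines: a year-over-year change as a crude seasonality adjustment, and a z-score as one form of normalization. This is an illustration under simple assumptions (strictly positive values, no gaps, a fixed seasonal period), not a substitute for a proper seasonal model. The function names and the sample series are hypothetical.

```python
from statistics import mean, pstdev

def yoy_change(series, period=12):
    """Growth versus the same season one cycle earlier (assumes positive values)."""
    return [series[i] / series[i - period] - 1.0 for i in range(period, len(series))]

def zscore(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical quarterly foot-traffic index with a strong Q4 peak.
traffic = [100.0, 120.0, 90.0, 150.0, 110.0, 126.0, 99.0, 150.0]

growth = yoy_change(traffic, period=4)  # compares each quarter to the same quarter last year
scaled = zscore(growth)                 # puts the growth series on a comparable scale
print(growth)
print(scaled)
```

Comparing like quarter with like quarter removes the Q4 spike from the signal, which a raw quarter-over-quarter change would not.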
Compliance & Ethics
- Privacy regulations (GDPR, CCPA)
- Consent and disclosure
- Material non-public information (MNPI)
Alternative Data Vendors
Data Aggregators
- Quandl/Nasdaq Data Link: Multiple alt data sources
- Eagle Alpha: Alt data marketplace
- Neudata: Alt data scout and reviews
Sector-Specific
Consumer:
- Earnest Research (credit cards)
- Placer.ai (foot traffic)
- SimilarWeb (web traffic)
Energy:
- Orbital Insight (tank levels)
- Kpler (shipping)
- Kayrros (satellite)
Agriculture:
- Gro Intelligence
- aWhere
- Descartes Labs
DIY Alternative Data
Free sources:
- Google Trends (search interest)
- Reddit API (sentiment)
- Government data (unconventional uses)
- OpenStreetMap (location data)
Web scraping (carefully):
- Job postings
- Pricing data
- Product availability
- Review aggregation
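As an illustration of the pricing-data case, the sketch below extracts prices from a static HTML snippet using only the standard library. The markup (a span with class "price") is hypothetical; a real page will differ, and fetching live pages should respect the site's terms and robots.txt, which is what "carefully" implies.

```python
from html.parser import HTMLParser

class PriceParser(HTMLParser):
    """Collects the text of elements whose class attribute is exactly 'price'."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        self._in_price = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_price and text:
            # Drop the currency symbol and thousands separators before parsing.
            self.prices.append(float(text.lstrip("$").replace(",", "")))

# Hypothetical page fragment, hard-coded so the example runs offline.
html = (
    '<ul>'
    '<li>Laptop <span class="price">$1,299.00</span></li>'
    '<li>Cable <span class="price">$19.99</span></li>'
    '</ul>'
)

parser = PriceParser()
parser.feed(html)
print(parser.prices)
```

Storing each scrape with a timestamp turns these snapshots into a price history that can be compared over time.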
Integration with Traditional Data
Blending Approaches
- Alternative confirms traditional: Higher conviction
- Alternative contradicts: Early warning?
- Alternative leads: Nowcasting advantage
Weighting Considerations
- Alternative data is often higher frequency
- Traditional data is usually more reliable
- Optimal blend depends on use case
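One simple blending rule, offered as an assumption rather than a recommendation, is inverse-variance weighting: each estimate is weighted by the reciprocal of its historical squared forecast error, so the more reliable source counts for more. The function name and the numbers below are hypothetical.

```python
def inverse_variance_blend(estimates, error_variances):
    """Weighted average where each weight is 1 / (historical error variance)."""
    if len(estimates) != len(error_variances):
        raise ValueError("need one error variance per estimate")
    if any(v <= 0 for v in error_variances):
        raise ValueError("error variances must be positive")
    weights = [1.0 / v for v in error_variances]
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, estimates)) / total

# Hypothetical revenue estimates ($M): analyst consensus vs an alt-data nowcast.
consensus, nowcast_value = 1000.0, 1060.0
# Hypothetical historical error variances: the consensus has been more reliable.
blend = inverse_variance_blend([consensus, nowcast_value], [100.0, 400.0])
print(round(blend, 2))  # about 1012: pulled only modestly toward the noisier nowcast
```

The rule treats the two sources' errors as independent; if they tend to miss in the same direction, the blend will be overconfident.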
Building an Alt Data Capability
Starting Points
- Google Trends: Free, easy to start
- Indeed/LinkedIn: Job posting analysis
- Web traffic tools: Free tiers available
- Satellite (free): Sentinel imagery is openly available through the EU's Copernicus programme
Scaling Up
- Identify specific use case
- Evaluate vendors (trials usually available)
- Build data pipeline
- Backtest rigorously
- Monitor live performance
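For the monitoring step, a rolling hit rate is one lightweight check: the share of recent periods in which the signal's sign matched the realized outcome's sign. The sketch below assumes hypothetical signal and outcome series and an arbitrary window length. A sustained drift toward 50% would be consistent with the signal decaying.

```python
def rolling_hit_rate(signals, outcomes, window):
    """Fraction of the last `window` periods where signal and outcome agreed in sign."""
    if window <= 0:
        raise ValueError("window must be positive")
    hits = [1 if s * o > 0 else 0 for s, o in zip(signals, outcomes)]
    return [
        sum(hits[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(hits))
    ]

# Hypothetical live record: +1/-1 calls and the realized earnings surprise.
signals = [1, -1, 1, 1, -1, 1]
outcomes = [0.5, -0.2, -0.1, 0.3, 0.4, -0.6]

rates = rolling_hit_rate(signals, outcomes, window=3)
print(rates)  # the hit rate falls from 2/3 to 1/3 over this made-up sample
```

With windows this short the rate is very noisy, so in practice a longer window or a formal test would be needed before concluding anything.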
Team Capabilities
- Data engineering
- Statistical skills
- Domain expertise
- Compliance awareness
Pro Tips
- Start with thesis: What are you trying to predict?
- Beware overfitting: Alternative data = lots of variables
- Real-time challenge: Confirm the data was actually available on the dates the vendor claims
- Decay is real: Good signals get arbitraged away
- Cost-benefit: Some data very expensive
- Combine sources: Multiple alt data > single source
