SparkLogs Foundry · Cohort intake open

Bring agentic IT diagnostics to your service desk.

An invitation-only program for MSPs to work with us on a data layer for AI so engineers can close hard tickets in minutes, not hours. Built alongside the engineers who actually use it.

Free usage across all clients during the Foundry period
2–4 hrs/mo commitment
Our founder Kevin Hoffman runs it directly
Ticket #4781 · same data, two paths
FILE SERVER IS INTERMITTENTLY SLOW
"Started this morning, no recent changes reported."
Manual today · Senior engineer, by hand · ~60 min
With SparkLogs · Same engineer plus AI partner · ~5 min
Time reclaimed: 12× faster

The situation

The MSP business problem.

Modern MSPs are squeezed from three directions at once. Today's AI tools help with the easy tickets, not the hard ones that consume precious senior engineer time.

01

Rising expectations

Clients expect fast resolution, or proactive detection before they ever feel the impact. SLAs keep tightening. Compliance bars keep rising.

HIPAA · PCI DSS · SOX · FISMA · CMMC · Cyber-insurance audits
02

Sprawling environments

Endpoints, cloud, SaaS, network appliances. More layers to break, more places to investigate, more vendor portals to log into.

Windows fleet · Hyper-V / VMware · M365 / Azure · Firewall + switch · RMM portals
03

Engineer scarcity

Senior techs are hard to hire and harder to keep. Repetitive diagnostic work is what burns them out fastest.

Tier 3 backlog · Burnout risk · Hard-to-hire · Knowledge concentration
To stay competitive, MSPs have to do more work, faster, across more surface area. All on the same oversubscribed senior engineering bench.

The shape of the work

How an MSP service desk works today.

Easy tickets flow smoothly. The hard ones bottleneck on senior engineers doing manual archaeology: hours per ticket, one machine at a time.

Stage 1 · Ticket entry
Stage 2 · Triage & dispatch
Stage 3 · Tier 1 resolve
Stage 4 · Tier 2 escalation
Stage 5 · Tier 3 senior eng.
Stage 6 · Vendor escalation
What the senior engineer actually does

Manual diagnosis, ticket by ticket

Remote in. Read logs by hand. Tail perf counters. Compare to last week from memory. Hop to switch & storage. Hours per hard ticket.

Why it stays manual

Signals fragmented & ad-hoc

Event logs, vendor portals, RMM. One machine at a time, one logfile at a time. No unified view across a client, let alone a book of clients.

Where AI works today

Text-data tasks

  • Categorize and route tickets
  • Suggest fixes from past tickets
  • Surface KB articles & runbooks
Where AI is left to guess

System-state tasks

  • Diagnose a hard, escalated ticket
  • Identify the root cause of an outage
  • Pinpoint what changed and when

What we're building

SparkLogs fills the gap.

Give AI agents eyes into every client system, current and historical, so they can do the gathering and structuring work while your engineers make every consequential decision.

Step 1

Client systems ship signals

Endpoints, servers, cloud, across every client in your book.

  • Lightweight Windows agent
  • Deploys via your RMM (MSI)
  • Outbound HTTPS only · no kernel driver
Logs + state
Step 2 · Data layer

SparkLogs cloud

Per-client logs and structured system state, aggregated with strict multi-tenant isolation.

  • Read-only MCP server
  • Token-efficient query tools
  • Every claim cites verifiable evidence
MCP queries
Step 3

MSP AI agent

Runs inside the AI tool your engineers already use.

  • Claude Code, Cursor, Copilot, Codex, Gemini
  • Skills for investigation and root-cause analysis
  • Engineers still make every action decision
01

Auto-diagnose

Performs the initial diagnostic steps a senior engineer normally does by hand.

02

Suggest causes

Returns likely root causes with confidence ratings and cited evidence. Not guesses.

03

Interactive partner

Answers follow-up questions and goes deeper as the engineer investigates.

Before vs. after

Same ticket. 12× faster.

One real-life example. The senior engineer's judgment still drives every decision. They just aren't burning an hour on data collection and reading logs.

TICKET #4781 · "File server is intermittently slow: started this morning, no recent changes reported."
~60 min → ~5 min

Manual today

~60 min · Tier 3 engineer
  1. Remote into the file server.
  2. Pull and tail Windows event logs by hand.
  3. Check perf counters: CPU, disk, network.
  4. Compare today’s metrics to last week from memory.
  5. Hop to switch & storage to rule out the network.
  6. Manually review scheduled jobs and installed software.
  7. Write up findings and a likely cause.
Senior engineer's time consumed gathering facts manually, one tool at a time.

With SparkLogs + AI

~5 min · Tier 3 + agent
  1. Agent queries SparkLogs across server + network.
  2. Agent has SparkLogs diff current state vs. the last-known-good baseline.
  3. Returns likely cause: backup job overlapping with business hours (84% confidence, cited).
  4. Engineer asks: "Show me the top 5 IO waits since 8:30"
  5. Engineer confirms cause and reschedules the backup job.
Agent does the gathering and structuring. Engineer keeps every decision.
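
To make step 2 concrete, here is a minimal Python sketch of what "diff current state vs. the last-known-good baseline" could look like. It is an illustration only, not the SparkLogs implementation: the `diff_state` function, the flat key/value snapshot format, and every key and value in the sample data are hypothetical.

```python
from typing import Any


def diff_state(baseline: dict[str, Any], current: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Compare two flat state snapshots and report what was added, removed, or changed.

    Hypothetical helper: the snapshot format here is an assumption for illustration.
    """
    # Keys present now but absent from the baseline.
    added = {k: current[k] for k in current.keys() - baseline.keys()}
    # Keys present in the baseline but gone now.
    removed = {k: baseline[k] for k in baseline.keys() - current.keys()}
    # Keys present in both snapshots whose values differ.
    changed = {
        k: {"before": baseline[k], "after": current[k]}
        for k in baseline.keys() & current.keys()
        if baseline[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots of one file server (all keys and values are made up).
baseline = {
    "scheduled_task.backup.start": "01:00",
    "service.spooler": "running",
}
current = {
    "scheduled_task.backup.start": "09:00",
    "service.spooler": "running",
    "software.example_tool": "1.0",
}

result = diff_state(baseline, current)
for key, change in result["changed"].items():
    print(f"{key}: {change['before']} -> {change['after']}")
```

In this sketch, unchanged keys drop out of the result, so the engineer sees only what moved since the baseline instead of the full snapshot.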

Six properties, designed together

What we're actually building.

None of these are optional. The platform is only useful for MSP diagnosis if every one of them holds.

01 · Endpoint

Managed agent

Lightweight Windows agent deploys via your RMM. No kernel driver, no remote-execute, outbound HTTPS only.

02 · Data

Captures system health

Software inventory, processes, services, drivers, certs, network state, performance. Not just logs.

03 · AI

Integrated with AI hosts (MCP)

Plugs into Claude Code, Cursor, Copilot, Codex, and Gemini via MCP. Engineers bring their own AI tool of choice.

04 · Posture

Read-only by design

The AI queries data. The managed agent has no remote-execute capability, so nothing runs on the endpoint at the AI's request. Your engineer keeps every action decision.

05 · Trust

Every claim cited

Each finding links to a verifiable evidence URL one click away. Hypotheses are a separate, opt-in step.

06 · Governance

Audit-ready

Every investigation produces a complete audit trail. Service managers review patterns; auditors trace evidence.

Honest scope

Where we are today.

We'd rather ship a narrow product done well than a broad one done poorly. Here's what's live, what's building now in the Foundry, and what's deliberately deferred.

Live

The platform

In production today.
  • Petabyte-scale log management
  • Multi-tenant by design: per-client orgs
  • 5–10× lower cost than typical SIEMs
  • Data in 5 regions: US, CA, EU, UK, AU
  • 100s of data sources: HTTPS, syslog, OTLP, Elastic, Loki, OSS log shippers
  • Schemaless ingest · search data of any age
  • Archive to your own object storage
  • HIPAA / PCI / SOX / FISMA-grade retention
Building now · Foundry

The AI diagnostic layer

Co-developed with our first MSP cohort.
  • Managed Windows agent (MSI · RMM-friendly · minimal system resources)
  • Read-only MCP server with token-efficient querying
  • /sparklogs-investigate
  • /sparklogs-analyze-cause
  • Cited findings · audit trail per investigation
  • Claude Code primary; Cursor / Codex / Gemini / Copilot rolling in
Coming later

Deferred on purpose

Sequenced after v1 lands.
  • macOS & Linux managed agents
  • External data: M365, Azure, EDR, RMM APIs
  • Proactive anomaly alerts + AI detective
  • Cloud-based syslog ingestion
  • Cross-host (Hyper-V / VMware) correlation

SparkLogs Foundry · Early access

Join the Foundry.

Bring agentic IT diagnostics to your service desk, alongside the engineers who use them every day. Free SparkLogs usage, direct founder access, permanent Foundry Partner designation.

  1. Fill out the application.
  2. Kevin replies personally. Brief intro call to align on fit.
  3. Onboard your fleet via your RMM and start shaping the product.

Kevin Hoffman, PhD

Founder & CEO, SparkLogs · Previously co-founder, Axcient

Apply to the Foundry

Tell us about your MSP.

Kevin replies to every application personally. Expect to hear back within two business days.

By applying you agree to receive program updates from SparkLogs. No spam, easy unsubscribe.