Log0

Running Locally

Step-by-step guide to run the full log0 pipeline on your machine - Docker infrastructure, environment variables, all seven services, health checks, log submission, and Kafka/PostgreSQL inspection.

Prerequisites

Make sure these are installed before you begin:

| Tool | Version | Notes |
| --- | --- | --- |
| Java | 25+ | java -version to verify |
| Maven | 3.9.12+ | Included via mvnw wrapper - no separate install needed |
| Docker Desktop | Latest | Must be running before Step 1 |
| Git | Any | To clone the repo |
| Postman (optional) | Latest | For the Postman collection in refs/postman/ |

Step 1 - Start Infrastructure

Start Kafka, Zookeeper, PostgreSQL, and ClickHouse via Docker Compose. This also auto-creates all required Kafka topics.

cd docker
docker-compose up -d

Verify all containers are running:

docker ps

You should see log0-kafka, log0-zookeeper, log0-postgres, and log0-clickhouse in the output. The log0-kafka-setup container will run once to create topics and exit - that's expected.
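Instead of eyeballing the docker ps output, a small script can compare it against the expected set. This is a sketch (missing_containers is not part of the repo); the container names are the ones listed above.

```shell
#!/bin/sh
# Read running container names from stdin and report any expected one that is absent.
missing_containers() {
  running=" $(cat | tr '\n' ' ') "
  for name in log0-kafka log0-zookeeper log0-postgres log0-clickhouse; do
    case "$running" in
      *" $name "*) ;;                 # present - nothing to report
      *) echo "missing: $name" ;;
    esac
  done
}

# Usage:
# docker ps --format '{{.Names}}' | missing_containers
```

No output means all four infrastructure containers are up.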

Verify ClickHouse is up:

macOS/Linux:

curl http://localhost:8123/ping
# Expected: Ok.

Windows (PowerShell):

Invoke-RestMethod http://localhost:8123/ping
# Expected: Ok.

Windows (CMD):

curl http://localhost:8123/ping
REM Expected: Ok.
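ClickHouse can take a few seconds to accept connections after the container starts. If you want to script the wait instead of retrying by hand, a polling helper like this works (a sketch - wait_for_ok is a hypothetical function, not part of the repo):

```shell
#!/bin/sh
# Hypothetical helper: re-run a probe command until it prints "Ok." or give up.
wait_for_ok() {
  probe=$1   # command string to run, e.g. "curl -s http://localhost:8123/ping"
  max=$2     # maximum attempts, roughly one per second
  i=0
  while [ "$i" -lt "$max" ]; do
    if [ "$($probe 2>/dev/null)" = "Ok." ]; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out"
  return 1
}

# Usage:
# wait_for_ok "curl -s http://localhost:8123/ping" 30
```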

Step 2 - Environment Variables

The notification-service requires Slack credentials, the ai-service requires an LLM provider API key, and the auth-service requires a JWT signing secret at startup. The other services need no additional configuration.

Getting Your Slack Credentials

Step 2a - Create a Slack App:

  1. Go to api.slack.com/apps → Create New App → From Scratch
  2. Name it (e.g., log0-local) and select your workspace → Create App

Step 2b - Add Bot Token Scopes:

  1. In your app settings, go to OAuth & Permissions → Bot Token Scopes
  2. Add the scope: chat:write
  3. Click Install to Workspace → Allow
  4. Copy the Bot User OAuth Token - it starts with xoxb-

Step 2c - Get Your Channel ID:

  1. In Slack, open the channel where you want notifications
  2. Click the channel name at the top → scroll down → Copy channel ID (looks like C0XXXXXXXXX)
  3. Invite the bot to the channel: /invite @log0-local

Creating the .env File

Create a .env file inside services/notification-service/:

macOS/Linux:

cd services/notification-service
cp .env.example .env
nano .env   # or vim, code, etc.

Windows (PowerShell):

cd services\notification-service
Copy-Item .env.example .env
notepad .env

Windows (CMD):

cd services\notification-service
copy .env.example .env
notepad .env

Fill in your real values:

SLACK_BOT_TOKEN=xoxb-your-actual-token
SLACK_CHANNEL_ID=C0XXXXXXXXX

The .env file is gitignored - your credentials will never be committed.

Running without Slack: You can skip this step and run the notification-service without real credentials. It will start but log {"ok":false,"error":"invalid_auth"} for each Slack call. The rest of the pipeline (ingestion → normalization → clustering → incidents) works fully without Slack.

AI Service credentials

The ai-service uses Groq by default - free tier, no credit card required. Get a key at console.groq.com, then:

macOS/Linux:

cd services/ai-service
cp .env.example .env
nano .env   # or vim, code, etc.

Windows (PowerShell):

cd services\ai-service
Copy-Item .env.example .env
notepad .env

Windows (CMD):

cd services\ai-service
copy .env.example .env
notepad .env

Fill in your Groq key:

GROQ_API_KEY=gsk_your-actual-key-here

The other three keys (OPENAI_API_KEY, GEMINI_API_KEY, ANTHROPIC_API_KEY) are optional - only needed if you switch ai.provider in services/ai-service/src/main/resources/application.yml.
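For reference, the knob looks something like this. This is an assumed sketch of the application.yml shape - check the actual file for the exact key names and accepted values; based on the env var names above, the providers likely include groq, openai, gemini, and anthropic.

```yaml
# services/ai-service/src/main/resources/application.yml (excerpt, assumed shape)
ai:
  provider: groq   # switch to another provider here; its matching *_API_KEY must then be set in .env
```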

Auth Service credentials

The auth-service requires a JWT_SECRET - a random string of at least 32 characters used to sign and verify JWT tokens with HS256. Generate one and set it in services/auth-service/.env:

Step 1 - Generate a secret:

openssl rand -base64 48

Or using only built-in tools:

macOS/Linux:

LC_ALL=C tr -dc 'A-Za-z0-9' </dev/urandom | head -c 48; echo

Windows (PowerShell):

-join (1..48 | ForEach-Object { [char](Get-Random -InputObject ((65..90) + (97..122) + (48..57))) })

This picks 48 random characters from A–Z, a–z, 0–9 (with repetition allowed) and joins them into a single string. Copy the output - that's your secret.

Windows (CMD): CMD does not have a built-in random string generator. Use PowerShell instead (run powershell to open it), or manually type any 48-character random string.

Step 2 - Create the .env file:

macOS/Linux:

cd services/auth-service
cp .env.example .env
nano .env   # or vim, code, etc.

Windows (PowerShell):

cd services\auth-service
Copy-Item .env.example .env
notepad .env

Windows (CMD):

cd services\auth-service
copy .env.example .env
notepad .env

Fill in your generated secret:

JWT_SECRET=your-generated-48-char-secret-here

The .env file is gitignored - your secret will never be committed.

Why at least 32 characters? HS256 (HMAC-SHA256) requires a key of at least 256 bits (32 bytes). Shorter secrets are rejected by the JWT library at startup. 48 characters from the alphanumeric set gives ~285 bits of entropy - comfortably above the minimum.
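The "~285 bits" figure is just n × log2(k) for 48 characters drawn from a 62-symbol alphabet. You can reproduce the arithmetic in one line:

```shell
# Entropy in bits of a 48-character secret over A-Z, a-z, 0-9 (62 symbols):
awk 'BEGIN { printf "%.1f bits\n", 48 * log(62) / log(2) }'
# -> 285.8 bits
```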


Step 3 - Start All Services

Open 7 separate terminals - one per service. All commands run from the project root (log0-services/).

Terminal 1 - Ingestion Gateway (port 8080)

macOS/Linux:

cd services/ingestion-gateway
./mvnw spring-boot:run

Windows (PowerShell):

cd services\ingestion-gateway
.\mvnw.cmd spring-boot:run

Windows (CMD):

cd services\ingestion-gateway
mvnw.cmd spring-boot:run

Terminal 2 - Normalization Service (port 8081)

macOS/Linux:

cd services/normalization-service
./mvnw spring-boot:run

Windows (PowerShell):

cd services\normalization-service
.\mvnw.cmd spring-boot:run

Windows (CMD):

cd services\normalization-service
mvnw.cmd spring-boot:run

Terminal 3 - Clustering Service (port 8082)

macOS/Linux:

cd services/clustering-service
./mvnw spring-boot:run

Windows (PowerShell):

cd services\clustering-service
.\mvnw.cmd spring-boot:run

Windows (CMD):

cd services\clustering-service
mvnw.cmd spring-boot:run

Terminal 4 - Incident Service (port 8083)

macOS/Linux:

cd services/incident-service
./mvnw spring-boot:run

Windows (PowerShell):

cd services\incident-service
.\mvnw.cmd spring-boot:run

Windows (CMD):

cd services\incident-service
mvnw.cmd spring-boot:run

Terminal 5 - Notification Service (port 8084)

This service needs Slack env vars loaded before it starts.

macOS/Linux:

cd services/notification-service
export $(grep -v '^#' .env | xargs)
./mvnw spring-boot:run

Windows (PowerShell):

cd services\notification-service
Get-Content .env | ForEach-Object { $k,$v = $_ -split '=',2; [System.Environment]::SetEnvironmentVariable($k,$v) }
.\mvnw.cmd spring-boot:run

Windows (CMD):

cd services\notification-service
for /f "eol=# tokens=1,* delims==" %i in (.env) do set "%i=%j"
mvnw.cmd spring-boot:run

Terminal 6 - AI Service (port 8085)

This service needs the Groq API key loaded before it starts.

macOS/Linux:

cd services/ai-service
export $(grep -v '^#' .env | xargs)
./mvnw spring-boot:run

Windows (PowerShell):

cd services\ai-service
Get-Content .env | Where-Object { $_ -notmatch '^#' -and $_ -match '=' } | ForEach-Object { $k,$v = $_ -split '=',2; Set-Item "env:$k" $v }
.\mvnw.cmd spring-boot:run

Windows (CMD):

cd services\ai-service
for /f "eol=# tokens=1,* delims==" %i in (.env) do set "%i=%j"
mvnw.cmd spring-boot:run

Terminal 7 - Auth Service (port 8086)

This service needs the JWT_SECRET loaded before it starts.

macOS/Linux:

cd services/auth-service
export $(grep -v '^#' .env | xargs)
./mvnw spring-boot:run

Windows (PowerShell):

cd services\auth-service
Get-Content .env | Where-Object { $_ -notmatch '^#' -and $_ -match '=' } | ForEach-Object { $k,$v = $_ -split '=',2; Set-Item "env:$k" $v }
.\mvnw.cmd spring-boot:run

Windows (CMD):

cd services\auth-service
for /f "eol=# tokens=1,* delims==" %i in (.env) do set "%i=%j"
mvnw.cmd spring-boot:run
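The per-shell one-liners above all do the same job. On macOS/Linux a small reusable function avoids repeating the export line across Terminals 5–7 (a sketch - load_env is not part of the repo; note it sources the file as shell, so values containing spaces must be quoted in .env):

```shell
#!/bin/sh
# load_env: export every KEY=value pair in a .env-style file.
# `set -a` auto-exports any variable assigned while it is active,
# and the shell ignores blank lines and # comments on its own.
load_env() {
  set -a
  . "${1:-./.env}"
  set +a
}

# Usage, from a service directory:
# load_env && ./mvnw spring-boot:run
```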

Step 4 - Health Checks

Once all services have printed Started ... in ... seconds, verify they are all up:

macOS/Linux:

curl http://localhost:8080/actuator/health
curl http://localhost:8082/actuator/health
curl http://localhost:8083/actuator/health
curl http://localhost:8084/actuator/health
curl http://localhost:8085/actuator/health
curl http://localhost:8086/actuator/health

Windows (PowerShell):

Invoke-RestMethod http://localhost:8080/actuator/health
Invoke-RestMethod http://localhost:8082/actuator/health
Invoke-RestMethod http://localhost:8083/actuator/health
Invoke-RestMethod http://localhost:8084/actuator/health
Invoke-RestMethod http://localhost:8085/actuator/health
Invoke-RestMethod http://localhost:8086/actuator/health

Every endpoint should return {"status":"UP"}.

Note: normalization-service runs on port 8081 but its actuator is not configured by default. If http://localhost:8081/actuator/health returns a connection error, the service is still healthy as long as it's consuming from raw-logs (visible in the terminal logs).
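The six checks can also be rolled into one loop. This is a sketch: probe is a hypothetical wrapper (not part of the repo) kept separate so it is easy to swap out, and 8081 is omitted per the note above.

```shell
#!/bin/sh
# Probe one service's actuator endpoint.
probe() { curl -s "http://localhost:$1/actuator/health"; }

# Check every service port in one pass.
check_all() {
  for port in 8080 8082 8083 8084 8085 8086; do
    echo "$port -> $(probe "$port")"
  done
}

# Usage:
# check_all
```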

| Service | Port | Role |
| --- | --- | --- |
| ingestion-gateway | 8080 | Receives logs from clients |
| normalization-service | 8081 | Parses, fingerprints, and stores logs to ClickHouse |
| clustering-service | 8082 | Counts occurrences, triggers incidents |
| incident-service | 8083 | Stores incidents, exposes REST API (incl. log query) |
| notification-service | 8084 | Sends Slack alerts |
| ai-service | 8085 | Generates AI summaries via LLM |
| auth-service | 8086 | Tenant registration, JWT auth, API key management |

Infrastructure:

| Container | Port | Role |
| --- | --- | --- |
| log0-kafka | 9092 | Event streaming |
| log0-zookeeper | 2181 | Kafka coordination |
| log0-postgres | 5433 | Incident & tenant data |
| log0-clickhouse | 8123 (HTTP), 9000 (native) | Log event storage |

Step 5 - Submit a Log

Send a single test log to confirm the ingestion gateway accepts it. A 202 Accepted means the log was published to Kafka.

macOS/Linux:

curl -X POST http://localhost:8080/api/v1/logs \
  -H "Content-Type: application/json" \
  -H "X-TENANT-ID: 550e8400-e29b-41d4-a716-446655440000" \
  -H "X-SERVICE-NAME: payment-service" \
  -H "X-ENVIRONMENT: production" \
  -H "X-API-KEY: test-key-123" \
  -d '{
    "timestamp": "2026-03-28T10:30:00Z",
    "level": "ERROR",
    "message": "Database connection timeout after 30s",
    "trace": "java.sql.SQLException: Connection timeout"
  }'

Windows (PowerShell):

$body = @{
    timestamp = "2026-03-28T10:30:00Z"
    level     = "ERROR"
    message   = "Database connection timeout after 30s"
    trace     = "java.sql.SQLException: Connection timeout"
} | ConvertTo-Json

Invoke-RestMethod -Uri "http://localhost:8080/api/v1/logs" `
  -Method POST `
  -ContentType "application/json" `
  -Headers @{
      "X-TENANT-ID"    = "550e8400-e29b-41d4-a716-446655440000"
      "X-SERVICE-NAME" = "payment-service"
      "X-ENVIRONMENT"  = "production"
      "X-API-KEY"      = "test-key-123"
  } `
  -Body $body

Windows (CMD):

curl -X POST http://localhost:8080/api/v1/logs ^
  -H "Content-Type: application/json" ^
  -H "X-TENANT-ID: 550e8400-e29b-41d4-a716-446655440000" ^
  -H "X-SERVICE-NAME: payment-service" ^
  -H "X-ENVIRONMENT: production" ^
  -H "X-API-KEY: test-key-123" ^
  -d "{\"timestamp\":\"2026-03-28T10:30:00Z\",\"level\":\"ERROR\",\"message\":\"Database connection timeout after 30s\",\"trace\":\"java.sql.SQLException: Connection timeout\"}"

Trigger an incident: Send the same request 10+ times. The clustering-service counts occurrences per fingerprint and publishes to incident-events once the threshold is reached. Use Postman Collection Runner with Iterations = 12 for convenience.
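If you are not using Postman, a shell loop does the same thing. A sketch for macOS/Linux: send_log wraps the Step 5 curl call (same endpoint, headers, and body) and burst repeats it.

```shell
#!/bin/sh
# send_log: the Step 5 request, printing only the HTTP status code.
send_log() {
  curl -s -o /dev/null -w "%{http_code}" -X POST http://localhost:8080/api/v1/logs \
    -H "Content-Type: application/json" \
    -H "X-TENANT-ID: 550e8400-e29b-41d4-a716-446655440000" \
    -H "X-SERVICE-NAME: payment-service" \
    -H "X-ENVIRONMENT: production" \
    -H "X-API-KEY: test-key-123" \
    -d '{"timestamp":"2026-03-28T10:30:00Z","level":"ERROR","message":"Database connection timeout after 30s","trace":"java.sql.SQLException: Connection timeout"}'
}

# burst N: fire the same log N times so one fingerprint crosses the threshold.
burst() {
  i=0
  while [ "$i" -lt "$1" ]; do
    i=$((i + 1))
    echo "log $i -> HTTP $(send_log)"
  done
}

# Usage:
# burst 12
```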

Verify AI summary: After triggering an incident, connect to PostgreSQL and check that ai_summary was populated:

SELECT incident_id, service_name, status, ai_summary
FROM incident
ORDER BY created_at DESC
LIMIT 5;

The first incident row should contain a structured summary from the LLM within a few seconds of creation.

Manual AI service test (isolated)

You can also test ai-service independently without going through the full pipeline. With ai-service running, send a direct POST:

macOS/Linux:

curl -X POST http://localhost:8085/api/v1/summaries \
  -H "Content-Type: application/json" \
  -d '{
    "incidentId": "00000000-0000-0000-0000-000000000001",
    "tenantId": "test-tenant",
    "serviceName": "payment-service",
    "environment": "production",
    "severity": "HIGH",
    "occurrenceCount": 42,
    "firstSeenAt": "2024-01-01T10:00:00Z",
    "lastSeenAt": "2024-01-01T11:00:00Z",
    "topMessages": [
      "Connection refused: db-host:5432",
      "Timeout waiting for connection from pool"
    ]
  }'

Windows (PowerShell):

Invoke-RestMethod -Uri "http://localhost:8085/api/v1/summaries" -Method POST -ContentType "application/json" -Body '{"incidentId":"00000000-0000-0000-0000-000000000001","tenantId":"test-tenant","serviceName":"payment-service","environment":"production","severity":"HIGH","occurrenceCount":42,"firstSeenAt":"2024-01-01T10:00:00Z","lastSeenAt":"2024-01-01T11:00:00Z","topMessages":["Connection refused: db-host:5432","Timeout waiting for connection from pool"]}'

Windows (CMD):

curl -X POST http://localhost:8085/api/v1/summaries -H "Content-Type: application/json" -d "{\"incidentId\":\"00000000-0000-0000-0000-000000000001\",\"tenantId\":\"test-tenant\",\"serviceName\":\"payment-service\",\"environment\":\"production\",\"severity\":\"HIGH\",\"occurrenceCount\":42,\"firstSeenAt\":\"2024-01-01T10:00:00Z\",\"lastSeenAt\":\"2024-01-01T11:00:00Z\",\"topMessages\":[\"Connection refused: db-host:5432\",\"Timeout waiting for connection from pool\"]}"

Expected response: 202 Accepted. Check the ai-service logs - you should see the Groq API call succeed, then a callback attempt to incident-service (which will fail with a connection error if incident-service is not running - that's expected and handled gracefully).


Step 6 - Watch Kafka Topics

Open additional terminals to watch messages flowing through the pipeline in real time.

raw-logs (ingestion output)

macOS/Linux:

docker exec -it log0-kafka kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic raw-logs \
  --from-beginning \
  --property print.key=true

Windows:

docker exec -it log0-kafka kafka-console-consumer --bootstrap-server localhost:9092 --topic raw-logs --from-beginning --property print.key=true

normalized-logs (after normalization + fingerprinting)

macOS/Linux:

docker exec -it log0-kafka kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic normalized-logs \
  --from-beginning \
  --property print.key=true

Windows:

docker exec -it log0-kafka kafka-console-consumer --bootstrap-server localhost:9092 --topic normalized-logs --from-beginning --property print.key=true

incident-events (fires after 10 occurrences of the same fingerprint)

macOS/Linux:

docker exec -it log0-kafka kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic incident-events \
  --from-beginning \
  --property print.key=true

Windows:

docker exec -it log0-kafka kafka-console-consumer --bootstrap-server localhost:9092 --topic incident-events --from-beginning --property print.key=true

notification-events (triggers Slack messages)

macOS/Linux:

docker exec -it log0-kafka kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic notification-events \
  --from-beginning \
  --property print.key=true

Windows:

docker exec -it log0-kafka kafka-console-consumer --bootstrap-server localhost:9092 --topic notification-events --from-beginning --property print.key=true

raw-logs-dlq (dead letter queue - should stay empty)

macOS/Linux:

docker exec -it log0-kafka kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic raw-logs-dlq \
  --from-beginning

Windows:

docker exec -it log0-kafka kafka-console-consumer --bootstrap-server localhost:9092 --topic raw-logs-dlq --from-beginning

Check consumer group lag

Zero lag means all services are caught up. Lag > 0 means a service is behind or crashed.

macOS/Linux:

for group in normalization-service clustering-service incident-service notification-service; do
  echo "--- $group ---"
  docker exec -it log0-kafka kafka-consumer-groups \
    --bootstrap-server localhost:9092 \
    --describe --group $group
done

Windows:

docker exec -it log0-kafka kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group normalization-service
docker exec -it log0-kafka kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group clustering-service
docker exec -it log0-kafka kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group incident-service
docker exec -it log0-kafka kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group notification-service
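To reduce a group's table to a single number, you can sum the LAG column with awk. A sketch - it assumes LAG is the 6th field, as in recent Kafka releases; check the header row of your output and adjust the field index if it differs.

```shell
#!/bin/sh
# total_lag: sum the numeric LAG column (assumed field 6) of
# kafka-consumer-groups --describe output, skipping the header row.
total_lag() {
  awk 'NR > 1 && $6 ~ /^[0-9]+$/ { sum += $6 } END { print sum + 0 }'
}

# Usage:
# docker exec log0-kafka kafka-consumer-groups --bootstrap-server localhost:9092 \
#   --describe --group normalization-service | total_lag
```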

Step 7 - Inspect PostgreSQL

PostgreSQL is exposed on port 5433 (the container runs on 5432 internally; Docker maps it to 5433 on your host).

Connect to the database

docker exec -it log0-postgres psql -U log0 -d log0

Useful queries

Once inside psql, run these:

-- List all incidents (newest first)
SELECT incident_id, service_name, severity, status, occurrence_count, created_at
FROM incident
ORDER BY created_at DESC
LIMIT 10;

-- Check state transitions for a specific incident
SELECT from_status, to_status, changed_at
FROM incident_state_history
WHERE incident_id = '<your-incident-id>'
ORDER BY changed_at;

-- Count incidents per tenant
SELECT tenant_id, COUNT(*) AS total
FROM incident
GROUP BY tenant_id;

-- Exit psql
\q

Connect with an external client (pgAdmin, DBeaver, TablePlus)

| Setting | Value |
| --- | --- |
| Host | localhost |
| Port | 5433 |
| Database | log0 |
| Username | log0 |
| Password | log0 |

Step 8 - Inspect ClickHouse

ClickHouse stores every normalized log event and is queried by the incident-service to return the raw logs behind each incident. It runs on HTTP port 8123.

Query log events

macOS/Linux:

# All log events (newest first)
curl -s "http://localhost:8123/?user=log0&password=log0" \
  --data "SELECT event_id, tenant_id, level, message, fingerprint, timestamp FROM log0.log_events ORDER BY timestamp DESC LIMIT 10"

# Count by fingerprint (shows which errors repeat most)
curl -s "http://localhost:8123/?user=log0&password=log0" \
  --data "SELECT fingerprint, level, count() as hits FROM log0.log_events GROUP BY fingerprint, level ORDER BY hits DESC"

# Logs for a specific incident fingerprint
curl -s "http://localhost:8123/?user=log0&password=log0" \
  --data "SELECT event_id, message, timestamp FROM log0.log_events WHERE fingerprint = '<your-fingerprint>' ORDER BY timestamp DESC LIMIT 20"

Windows (PowerShell):

# All log events (newest first)
Invoke-WebRequest -Uri "http://localhost:8123/?user=log0&password=log0" -Method POST `
  -Body "SELECT event_id, tenant_id, level, message, fingerprint, timestamp FROM log0.log_events ORDER BY timestamp DESC LIMIT 10" `
  -UseBasicParsing | Select-Object -ExpandProperty Content

# Count by fingerprint (shows which errors repeat most)
Invoke-WebRequest -Uri "http://localhost:8123/?user=log0&password=log0" -Method POST `
  -Body "SELECT fingerprint, level, count() as hits FROM log0.log_events GROUP BY fingerprint, level ORDER BY hits DESC" `
  -UseBasicParsing | Select-Object -ExpandProperty Content

# Logs for a specific incident fingerprint
Invoke-WebRequest -Uri "http://localhost:8123/?user=log0&password=log0" -Method POST `
  -Body "SELECT event_id, message, timestamp FROM log0.log_events WHERE fingerprint = '<your-fingerprint>' ORDER BY timestamp DESC LIMIT 20" `
  -UseBasicParsing | Select-Object -ExpandProperty Content

Windows (CMD):

REM All log events (newest first)
curl -s "http://localhost:8123/?user=log0&password=log0" --data "SELECT event_id, level, message, fingerprint, timestamp FROM log0.log_events ORDER BY timestamp DESC LIMIT 10"

Query via incident API

Once you have an incident ID, you can retrieve its raw log events directly through the incident-service REST API:

macOS/Linux:

curl "http://localhost:8083/api/v1/incidents/<incident-id>/logs?tenantId=<tenant-id>&page=0&size=50"

Windows (PowerShell):

Invoke-RestMethod "http://localhost:8083/api/v1/incidents/<incident-id>/logs?tenantId=<tenant-id>&page=0&size=50"

Returns a JSON array of log events ordered newest-first. Supports page (0-based) and size (max 200) query params.

Connect with an external client (DBeaver, DataGrip, TablePlus)

| Setting | Value |
| --- | --- |
| Host | localhost |
| Port | 8123 (HTTP) or 9000 (native TCP) |
| Database | log0 |
| Username | log0 |
| Password | log0 |
| Driver | ClickHouse |

Pipeline at a Glance

POST /api/v1/logs (8080)
  → raw-logs (Kafka)
    → normalization-service (8081)
      ├── → ClickHouse log_events table (log storage)
      └── → normalized-logs (Kafka)
            → clustering-service (8082)
              → incident-events (Kafka)  ← fires at 10 occurrences
                → incident-service (8083) → PostgreSQL (incident table)
                    ├── → POST /api/v1/summaries (8085, async)
                    │       → ai-service → Groq LLM
                    │           → PATCH /api/v1/incidents/{id}/ai-summary (8083)
                    │               → incident.ai_summary saved to PostgreSQL
                    └── → notification-events (Kafka)
                              → notification-service (8084) → Slack

GET /api/v1/incidents/{id}/logs (8083)
  → ClickHouse log_events (query by tenant_id + fingerprint)

A full end-to-end test (10 logs → Slack message + AI summary) takes about 3–5 seconds under normal load.
