Building my own log aggregation and search server

Kryesh@lemmy.world · edit-2 3 days ago

Building my own log aggregation and search server

pineapple@lemmy.ml · 2 days ago

I don’t really understand the point of this. What kind of logs are you storing and why would you want to?

Possibly linux@lemmy.zip · 2 days ago

Threat detection

pineapple@lemmy.ml · 2 days ago

Why do logs help with threat detection?

Kryesh@lemmy.world · edit-2 2 days ago

Applications like metrics because they’re good for doing statistics so you can figure out things like “is this endpoint slow” or “how much traffic is there”

Security teams like logs because they answer questions like “who logged in to this host between these times?” Or “when did we receive a weird looking http request”, basically any time you want to find specific details about a single event logs are typically better; and threat hunting does a lot of analysis on specific one time events.

Logs are also helpful when troubleshooting, metrics can tell you there’s a problem but in my experience you’ll often need logs to actually find out what the problem is so you can fix it.

farcaller@fstab.sh · 3 days ago

This looks nice, but there’s plenty free alternatives in this space which warrants a section in the readme with the comparison to other products.

You mention ram usage, but it’s oftentimes a product of event size. Based on your numbers, your average event size is about 800 bytes. Let’s call it 1kb. That’s one million events per day. It’s surely sounds more promising than Elastic, but not reaching Loki numbers, or, if you focus on efficiency, is way behind Victoriametrics Logs (based on peeking at their benches).

I think the important bits you need to add is how you store the logs (i.e. which indices you build) and what are your trade-offs. Grep is an efficient logs processor which barely uses any ram but incurs dramatic I/O costs, after all.

Enterprises will be looking at different numbers and they have lots of SaaS products to choose from. Homelab users are absolutely your target audience and you can have it by making a better UI than the alternative (victoriametrics logs aren’t that comfortable to work with) or making resource usage lower (people run k8s clusters on RPis, they sure wonder about every megabyte of ram lost) or making the deployment easier (fire and forget, and when you come to it, it works).

It sounds like lots of things and I don’t want to be discouraging. What you started there is really nice-looking. Good job!