Ideas / random
- Metrics shouldn’t lie: if you don’t have a metric, don’t try to “guess” what is missing
- A zero value and no data are two different things. That distinction matters when you are on call, so don’t build tools that default missing data to zero (see the sketch after this list).
- You can understand an Event by just looking at its data, while Metrics must be aggregated to be useful (e.g. plotted over 15-minute windows) (https://dzone.com/articles/what-is-the-difference-between-metrics-and-events)
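A minimal sketch of the zero-vs-no-data point above (all names and values are made up): a helper that coerces missing datapoints to 0 hides outages, while keeping None makes the gap visible.
```python
# Hypothetical example: why "no data" must not be collapsed into 0.
# Each element is the value reported for a 1-minute bucket; None means
# the agent sent nothing for that bucket (host down, pipeline broken, ...).
datapoints = [12, 9, None, None, 0, 7]

def bad_fill(points):
    # Defaulting to zero: on call you see "traffic dropped to 0" and cannot
    # tell it apart from "we stopped receiving data at all".
    return [p if p is not None else 0 for p in points]

def good_fill(points):
    # Keeping the gap: None renders as a hole in the graph, and you can
    # alert on "no data" instead of on a fake zero.
    return points

print(bad_fill(datapoints))   # [12, 9, 0, 0, 0, 7]  <- lies
print(good_fill(datapoints))  # [12, 9, None, None, 0, 7]
```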
Time series/other databases
- InfluxDB
- InfluxDB is an open source time series platform. It includes APIs for storing and querying data, background processing for ETL, monitoring, and alerting, user dashboards, and tools for visualizing and exploring the data. The master branch of the repo now represents the latest InfluxDB, which bundles the functionality of Kapacitor (background processing) and Chronograf (the UI) into a single binary. (See the write/query sketch after this list.)
- https://druid.apache.org/
- Apache Druid is a high performance real-time analytics database. Druid is designed for workflows where fast ad-hoc analytics, instant data visibility, or supporting high concurrency is important. As such, Druid is often used to power UIs where an interactive, consistent user experience is desired.
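To get a feel for the "APIs for storing and querying data" part, here is a minimal sketch assuming the official influxdb-client Python package against an InfluxDB 2.x instance; the URL, token, org, and bucket names are placeholders.
```python
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Placeholder connection details for a local InfluxDB 2.x instance.
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")

# Write one datapoint: measurement + tags + fields.
write_api = client.write_api(write_options=SYNCHRONOUS)
point = Point("http_requests").tag("service", "checkout").field("latency_ms", 42.0)
write_api.write(bucket="my-bucket", record=point)

# Query the last 15 minutes back with Flux.
query_api = client.query_api()
tables = query_api.query('from(bucket:"my-bucket") |> range(start: -15m)')
for table in tables:
    for record in table.records:
        print(record.get_field(), record.get_value())
```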
Handling Data
- https://modin.readthedocs.io/en/latest/ (Scale your pandas workflow by changing a single line of code)
- Modin uses Ray or Dask to provide an effortless way to speed up your pandas notebooks, scripts, and libraries. Unlike other distributed DataFrame libraries, Modin provides seamless integration and compatibility with existing pandas code. Even using the DataFrame constructor is identical. To use Modin, you do not need to know how many cores your system has and you do not need to specify how to distribute the data. In fact, you can continue using your previous pandas notebooks while experiencing a considerable speedup from Modin, even on a single machine. Once you’ve changed your import statement, you’re ready to use Modin just like you would pandas.
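The "single line of code" change the Modin docs describe is the import; a minimal sketch (the CSV file and column name are hypothetical):
```python
# Instead of `import pandas as pd`, import Modin's drop-in replacement.
# Assumes `pip install "modin[ray]"` (Ray or Dask provides the engine).
import modin.pandas as pd

df = pd.read_csv("large_file.csv")          # hypothetical file; same API as pandas
print(df.groupby("user_id").size().head())  # hypothetical column
```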
Events
- “In contrast to metrics, events are snapshots of what happened at a particular point-in-time.” (from https://learning.oreilly.com/library/view/observability-engineering/9781492076438/)
- https://dzone.com/articles/what-is-the-difference-between-metrics-and-events (<-- very good intro to what events are)
- https://docs.datadoghq.com/events/
- https://docs.aws.amazon.com/health/latest/ug/cloudwatch-events-health.html
- https://docs.datadoghq.com/monitors/create/types/event/
- https://www.eventstore.com/eventstoredb
Twitter thread (by me)
Still on this: you can emit an Event using a library, or you can log a line at the end of the work (e.g. for a webserver, just before sending the response to the user) with all of the Event’s information.
So if you are using Splunk to filter the logs, you can search for that line (e.g. RequestAPIEvent) and find all the information you need. Splunk’s query language also has great features that help you extract fields from that line 😃 (see the sketch below the thread)
(source https://twitter.com/elias_era/status/1483858337640894464)
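A minimal sketch of the "log one line with all the Event’s information" idea from the thread. The framework-free handler and the field names are made up; the RequestAPIEvent marker matches the example above, so it is the string you would search for in Splunk.
```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("events")

def handle_request(user_id, endpoint):
    start = time.time()
    # ... do the actual work of the request here ...
    status = 200

    # One structured line per request, emitted just before responding.
    # In Splunk you can search for "RequestAPIEvent" and then extract the
    # JSON fields with its query language.
    event = {
        "event_type": "RequestAPIEvent",
        "user_id": user_id,
        "endpoint": endpoint,
        "status": status,
        "duration_ms": round((time.time() - start) * 1000, 2),
    }
    logger.info("RequestAPIEvent %s", json.dumps(event))

handle_request(user_id=42, endpoint="/api/orders")
```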
Possible Databases
- https://www.eventstore.com/eventstoredb
- https://cloudoki.com/event-logging-elasticsearch/
- https://www.elastic.co/elasticsearch/
- https://www.elastic.co/guide/en/elasticsearch/reference/current/watching-meetup-data.html
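For the Elasticsearch option, a minimal sketch of storing and querying event documents, assuming the official elasticsearch Python client (8.x style arguments); the endpoint, index name, and fields are placeholders.
```python
from datetime import datetime, timezone
from elasticsearch import Elasticsearch

# Placeholder endpoint; a real setup needs auth/TLS configuration.
es = Elasticsearch("http://localhost:9200")

# Index one event document; every field becomes searchable/aggregatable.
es.index(
    index="events",
    document={
        "event_type": "RequestAPIEvent",
        "endpoint": "/api/orders",
        "status": 200,
        "@timestamp": datetime.now(timezone.utc).isoformat(),
    },
)

# Find all events of that type from the last 15 minutes.
hits = es.search(
    index="events",
    query={
        "bool": {
            "must": [{"match": {"event_type": "RequestAPIEvent"}}],
            "filter": [{"range": {"@timestamp": {"gte": "now-15m"}}}],
        }
    },
)
print(hits["hits"]["total"])
```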
Traces
- B3 Propagation headers: https://github.com/openzipkin/b3-propagation
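A minimal sketch of forwarding B3 multi-headers on an outgoing HTTP call, following the header names in the openzipkin/b3-propagation spec (X-B3-TraceId, X-B3-SpanId, X-B3-ParentSpanId, X-B3-Sampled). Uses the requests library; the URL and hex ids are placeholders.
```python
import secrets
import requests

def b3_headers(trace_id=None, parent_span_id=None, sampled=True):
    # Trace id is 64- or 128-bit lowercase hex, span id is 64-bit hex.
    # The parent span id is only set when continuing an existing trace.
    headers = {
        "X-B3-TraceId": trace_id or secrets.token_hex(16),  # new 128-bit trace id
        "X-B3-SpanId": secrets.token_hex(8),                # new 64-bit span id
        "X-B3-Sampled": "1" if sampled else "0",
    }
    if parent_span_id:
        headers["X-B3-ParentSpanId"] = parent_span_id
    return headers

# Hypothetical downstream call that continues an incoming trace.
resp = requests.get(
    "http://localhost:8080/api/orders",
    headers=b3_headers(
        trace_id="463ac35c9f6413ad48485a3953bb6124",  # placeholder incoming trace id
        parent_span_id="a2fb4a1d1a96d312",            # placeholder incoming span id
    ),
)
```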