Julius Volz
2015-03-29 19:48:32 UTC
Hi,
After initially test-driving InfluxDB[0][1] a year ago as an option for
long-term storage of Prometheus data, I just gave it another try with the
new tag support in InfluxDB 0.9.0. Paul had mentioned in
https://news.ycombinator.com/item?id=9001808 that this will fit the
Prometheus data model much better. I think it does improve things in terms
of being able to select more efficiently by tags (previously tag-style data
was stored in columns).
However, I'm still getting very high disk usage when I live-replicate all
metrics from a Prometheus server into InfluxDB. After running some standard
metric ingestion tests for about an hour, I get:
Prometheus: 13MB
InfluxDB: 180MB
Blowup factor: ~14x.
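For reference, the blowup factor above is simply the ratio of the two on-disk sizes. A minimal sketch of that arithmetic, where the function name is just illustrative:

```python
def blowup_factor(influxdb_mb: float, prometheus_mb: float) -> float:
    """Return how many times larger the InfluxDB data is on disk."""
    return influxdb_mb / prometheus_mb


# Sizes measured after roughly one hour of ingestion (from the numbers above).
factor = blowup_factor(influxdb_mb=180, prometheus_mb=13)
print(f"Blowup factor: ~{factor:.0f}x")
```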
These are some typical example "/write" requests I'm sending to InfluxDB:
https://gist.github.com/juliusv/d3a430d1ef943c73f0ef
Is there something I'm doing wrong/inefficiently that could negatively
impact disk usage? Am I abusing InfluxDB's data model too much by trying to
squeeze Prometheus-style metrics[2] into it? Essentially, our data model is
OpenTSDB-like, except that we'd prefer not to run OpenTSDB itself, but
something more modern that doesn't need Hadoop/HBase. I guess one
problem for our case is that InfluxDB is at its heart still a log
store for arbitrary sets of key/value columns per entry, rather than being
optimized around purely numeric time series data? Would there be any way of
making this work better?
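To make the mapping I have in mind concrete, here is a rough sketch of how a Prometheus-style sample (metric name, labels, float value, timestamp) could be turned into a tagged point. The Sample class, the to_point function, and the dictionary keys are all hypothetical and only illustrate the idea; this is not InfluxDB's actual "/write" wire format, nor necessarily what the requests in the gist look like.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Sample:
    """Hypothetical stand-in for one Prometheus-style sample."""
    name: str                # metric name, e.g. "http_requests_total"
    labels: Dict[str, str]   # label name -> label value
    value: float             # samples are purely numeric
    timestamp_ms: int        # milliseconds since the Unix epoch


def to_point(sample: Sample) -> Dict[str, Any]:
    """Map a sample onto a tagged point (illustrative key names only)."""
    return {
        "measurement": sample.name,          # metric name becomes the series name
        "tags": dict(sample.labels),         # labels become tags
        "fields": {"value": sample.value},   # a single numeric field per point
        "timestamp_ms": sample.timestamp_ms,
    }


point = to_point(
    Sample(
        name="http_requests_total",
        labels={"job": "api", "instance": "host1:9090"},
        value=1027.0,
        timestamp_ms=1427658512000,
    )
)
print(point)
```

Every point carries only one numeric field, so any per-entry overhead for generic key/value columns gets paid once per sample.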
I built InfluxDB from current HEAD (caf3259) and started it simply by
running "./influxd" as a single node, without any further configuration.
Cheers,
Julius
/BCC: prometheus-***@googlegroups.com
[0] http://prometheus.io/docs/introduction/comparison/#prometheus-vs.-influxdb
[1] https://docs.google.com/document/d/1OgnI7YBCT_Ub9Em39dEfx9BuiqRNS3oA62i8fJbwwQ8/edit#heading=h.e32xcwnzxp3e
[2] http://prometheus.io/docs/concepts/data_model/
--
You received this message because you are subscribed to the Google Groups "InfluxDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to influxdb+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
Visit this group at http://groups.google.com/group/influxdb.
To view this discussion on the web visit https://groups.google.com/d/msgid/influxdb/CAJeHL5d27r9juncHgkd8sWVVgUoYy0YyyQdrENxr-sggHPEYEA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.