Discussion:
[influxdb] How to calculate memory size on high volume influx data
s***@gmail.com
2016-04-04 02:28:18 UTC
We are trying out service and system monitoring with InfluxDB.
Our goal is to write 20,000,000 points every 5 minutes across 20,000,000 measurements in a single database.

Before doing that, we are running performance tests, and we have run into a memory issue.

1st test

test condition
- 100,000 measurements w/o tags
- write 10,000,000 points per 5 minutes (1,000 points per batch)

H/W
- 1 CPU (E5-2620 v3, hex core), 32G main memory, 4 x SSD(500M)

2nd test

test condition
- 10,00,000 measurements w/o tags
- write 10,000,000 points per 5 minutes (1,000 points per batch)

H/W
- 1 CPU (E5-2620 v3, hex core), 64G main memory, 4 x SSD(500M)
- more measurements and more memory than in the 1st test.


In both tests we hit a memory issue: once main memory was used up, InfluxDB started using swap.

With 64G of memory we can write to 2,000,000 measurements without a memory issue, but with more than 2,000,000 measurements the issue still occurs.


The test script is described in https://github.com/influxdata/influxdb/issues/6131.

Our question is how to estimate the required memory from the number of measurements or series as we increase the InfluxDB data volume.

Is there a formula or a detailed guide for calculating memory size from the number of measurements?
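
For reference, the write rates implied by the test conditions above can be worked out with a few lines of Python. This is only a sketch of the arithmetic; the function and variable names are made up for illustration and are not part of the test script.

```python
# Back-of-the-envelope write rates for the test conditions in this post.
POINTS_PER_WINDOW = 10_000_000   # points written per window (from the tests)
WINDOW_SECONDS = 5 * 60          # one 5-minute window
BATCH_SIZE = 1_000               # points per batch (from the tests)


def write_rates(points, window_s, batch_size):
    """Return (points per second, batches per window, batches per second)."""
    points_per_s = points / window_s
    batches = points // batch_size
    batches_per_s = batches / window_s
    return points_per_s, batches, batches_per_s


pps, batches, bps = write_rates(POINTS_PER_WINDOW, WINDOW_SECONDS, BATCH_SIZE)
print(f"{pps:,.0f} points/s, {batches:,} batches per window, {bps:.1f} batches/s")
# prints: 33,333 points/s, 10,000 batches per window, 33.3 batches/s
```

The target of 20,000,000 points every 5 minutes would double each of these figures.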
--
Remember to include the InfluxDB version number with all issue reports
---
You received this message because you are subscribed to the Google Groups "InfluxDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to influxdb+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
Visit this group at https://groups.google.com/group/influxdb.
To view this discussion on the web visit https://groups.google.com/d/msgid/influxdb/647b7ea2-4ff0-481a-b308-2e49ac9eca64%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Jon Seymour
2016-04-04 06:57:32 UTC
I don't think you will find a tool that produces a number that is more
accurate than a model you could create yourself, which you would
parameterise by measuring the memory used in steady state at different
write rates.
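
One simple way to build such a model is a straight-line fit through your own measurements. The sketch below assumes memory grows roughly linearly with load, which may not hold in practice, and the sample numbers are placeholders chosen only to show the mechanics; they are not measurements from any InfluxDB instance. All names are illustrative.

```python
# Least-squares straight-line fit of steady-state memory against load.
# "Load" can be whatever you varied in your tests: write rate, number of
# series, number of measurements, and so on.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, load):
    """Estimate memory at a given load from the fitted line."""
    return intercept + slope * load


# PLACEHOLDER data, not real measurements: replace with your own observations.
load_millions = [1, 2, 3, 4]            # e.g. millions of series
memory_gb = [10.0, 18.0, 26.0, 34.0]    # steady-state memory seen at each load

slope, intercept = fit_line(load_millions, memory_gb)
estimate = predict(slope, intercept, 20)
print(f"slope={slope:.1f} GB per million, intercept={intercept:.1f} GB, "
      f"estimate at 20 million={estimate:.1f} GB")
```

Extrapolating far beyond the measured range is risky, so it is worth measuring at several loads as close to the target as the hardware allows.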

If you plan to use the data you are collecting, you also need to add query
load to the mix because there is a good chance that memory requirements of
writes will be dwarfed by the memory required to run queries concurrently
(at least, that is my experience). I'd be surprised if you could do
anything useful with 20,000,000 measurements unless the probability of
querying any given measurement was very low.

jon.
s***@gmail.com
2016-04-11 02:38:28 UTC
Thanks for your reply.

We'll try to change our schema from 10,000,000 measurements (1 series per measurement) to a few measurements with many series.
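
To illustrate the schema change, here is a rough sketch of how the written points would differ in line protocol. The measurement name `metric`, the tag key `id` and the field key `value` are hypothetical, and the helpers do no escaping of special characters.

```python
# Line protocol layout: <measurement>[,<tag>=<value>...] <field>=<value> <timestamp>

def per_measurement_line(item_id, value, ts_ns):
    """Old schema: one measurement per monitored item, no tags."""
    return f"metric_{item_id} value={value} {ts_ns}"


def tagged_line(item_id, value, ts_ns):
    """New schema: one shared measurement, with the item identified by a tag."""
    return f"metric,id={item_id} value={value} {ts_ns}"


ts = 1459737600000000000  # arbitrary nanosecond timestamp, for illustration only
old_line = per_measurement_line(42, 0.5, ts)
new_line = tagged_line(42, 0.5, ts)
print(old_line)  # metric_42 value=0.5 1459737600000000000
print(new_line)  # metric,id=42 value=0.5 1459737600000000000
```

Each distinct tag value is still its own series, so the total number of series stays the same; whether this lowers memory use is something we will have to measure.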

As for caching of the in-memory index, when will we be able to use that feature?