write_tsdb with metadata #1107

Closed
wants to merge 13 commits into from

@ymettier (Contributor) commented Jul 2, 2015

Hello,

First of all, PR #1106 is a prerequisite for this PR. This is why I based my work on that branch.

This patch allows using metadata to completely rewrite some metrics for OpenTSDB.
The problem is best explained in issues #709 and #887: some metric names should not include the instance in their name but in a tag. For example, cpu.1.cpu.user should become cpu.user with an additional tag cpu=1.

This patch proposes a solution to this problem.

I added some metadata:

metadata                  explanation
tsdb_prefix               Prefixes the OpenTSDB <metric> (also prefixes tsdb_id if defined)
tsdb_id                   Replaces the metric with this tag
tsdb_tag_plugin           (see below)
tsdb_tag_pluginInstance   (see below)
tsdb_tag_type             (see below)
tsdb_tag_typeInstance     (see below)
tsdb_tag_dsname           (see below)

For every tsdb_tag_* entry: when defined, it removes the related item from the metric id.
If it is not empty, it is used as the key of an OpenTSDB tag (the value is the item itself).
If it is empty, no tag is defined.
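
The rules above can be modelled in a few lines. This is only an illustrative sketch in Python, not the plugin's C code: the function name rewrite, the PARTS ordering and the "." separator are assumptions made for the example, and tsdb_id is left out because its exact behaviour is not spelled out here. Skipping tags with an empty value mirrors the fix discussed later in this conversation.

```python
# Illustrative model only: this is NOT the write_tsdb C implementation.
# Assumption: the parts that stay in the metric name are joined with "."
# in the order plugin, pluginInstance, type, typeInstance, dsname.
PARTS = ("plugin", "pluginInstance", "type", "typeInstance", "dsname")


def rewrite(identifier, meta):
    """Return (metric, tags) for a collectd identifier and its metadata.

    identifier: dict mapping each name in PARTS to a string ("" if unset).
    meta: dict holding the tsdb_* metadata strings set by the filter chain.
    """
    kept = []
    tags = []
    for part in PARTS:
        value = identifier.get(part, "")
        key = meta.get("tsdb_tag_" + part)
        if key is None:
            # No tsdb_tag_* metadata: the part stays in the metric name.
            if value:
                kept.append(value)
        elif key and value:
            # Non-empty key: the part is removed and becomes a tag.
            tags.append((key, value))
        # Empty key: the part is removed and no tag is created.
    metric = meta.get("tsdb_prefix", "") + ".".join(kept)
    return metric, tags


cpu_user = {"plugin": "cpu", "pluginInstance": "1", "type": "cpu",
            "typeInstance": "user", "dsname": ""}
cpu_meta = {"tsdb_tag_pluginInstance": "cpu", "tsdb_tag_type": "",
            "tsdb_prefix": "sys."}
metric, tags = rewrite(cpu_user, cpu_meta)
print(metric, tags)  # sys.cpu.user [('cpu', '1')]
```

With an empty metadata dict the same identifier comes back unchanged as cpu.1.cpu.user, which matches the statement that nothing changes when no metadata is defined.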

Here is how I use them for the commonly used plugins cpu, df, disk, interface, load and swap:

<Chain "PreCache">
   <Rule "opentsdb_cpu">
     <Match "regex">
       Plugin "^cpu$"
     </Match>
     <Target "set">
       MetaDataSet "tsdb_tag_pluginInstance" "cpu"
       MetaDataSet "tsdb_tag_type" ""
       MetaDataSet "tsdb_prefix" "sys."
     </Target>
   </Rule>
   <Rule "opentsdb_df">
     <Match "regex">
       Plugin "^df$"
     </Match>
     <Target "set">
       MetaDataSet "tsdb_tag_pluginInstance" "mount"
       MetaDataSet "tsdb_tag_type" ""
       MetaDataSet "tsdb_prefix" "sys."
     </Target>
   </Rule>
   <Rule "opentsdb_disk">
     <Match "regex">
       Plugin "^disk$"
     </Match>
     <Target "set">
       MetaDataSet "tsdb_tag_pluginInstance" "disk"
       MetaDataSet "tsdb_prefix" "sys."
     </Target>
   </Rule>
   <Rule "opentsdb_interface">
     <Match "regex">
       Plugin "^interface$"
     </Match>
     <Target "set">
       MetaDataSet "tsdb_tag_pluginInstance" "iface"
       MetaDataSet "tsdb_prefix" "sys."
     </Target>
   </Rule>
   <Rule "opentsdb_load">
     <Match "regex">
       Plugin "^load$"
     </Match>
     <Target "set">
       MetaDataSet "tsdb_tag_type" ""
       MetaDataSet "tsdb_prefix" "sys."
     </Target>
   </Rule>
   <Rule "opentsdb_swap">
     <Match "regex">
       Plugin "^swap$"
     </Match>
     <Target "set">
       MetaDataSet "tsdb_prefix" "sys."
     </Target>
   </Rule>
 </Chain>

Of course, when you do not define metadata, it works just as before.

Regards,
Yves

This was referenced Jul 2, 2015
@ymettier (Contributor, Author) commented Jul 7, 2015

Hello,

Today's patch 32ce87f adds a new metadata tag :

metadata                  explanation
tsdb_tag                  Add a literal "tagk=tagv" tag

This is useful for cases where pluginInstance (or typeInstance) may or may not be set.

Example :

LoadPlugin processes
<Plugin processes>
  ProcessMatch "collectd-agent" ".*/sbin/collectd.*collectd.conf.*-f.*"
</Plugin>

<Chain "OpenTSDB">
  <Rule "opentsdb_processes_one">
    <Match "regex">
      Plugin "^processes$"
      PluginInstance "^."
    </Match>
    <Target "set">
      MetaDataSet "tsdb_tag_pluginInstance" "process"
    </Target>
  </Rule>
  <Rule "opentsdb_processes_all">
    <Match "regex">
      Plugin "^processes$"
      PluginInstance "^$"
    </Match>
    <Target "set">
      MetaDataSet "tsdb_tag" "process=all"
    </Target>
  </Rule>
</Chain>

In this example, when the pluginInstance is set (e.g. when matching the collectd-agent process), the pluginInstance is removed from the metric name and we get a tag process=collectd-agent (rule opentsdb_processes_one).
However, the processes plugin also monitors information about all processes and sends those metrics with pluginInstance not set. In this case (rule opentsdb_processes_all), I also add a tag named process with value all.

Here are some OpenTSDB put lines produced with this configuration:

put processes.ps_state.running 1436276102 0 fqdn=myhost.mydomain process=all
put processes.ps_state.sleeping 1436276102 134 fqdn=myhost.mydomain process=all
put processes.ps_state.zombies 1436276102 0 fqdn=myhost.mydomain process=all

put processes.ps_count.threads 1436276102 9 fqdn=myhost.mydomain process=collectd-agent
put processes.ps_count.processes 1436276102 1 fqdn=myhost.mydomain process=collectd-agent
put processes.ps_cputime.user 1436276102 160000 fqdn=myhost.mydomain process=collectd-agent
put processes.ps_cputime.syst 1436276102 310000 fqdn=myhost.mydomain process=collectd-agent
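
The put lines above follow the OpenTSDB form "put <metric> <timestamp> <value> <tagk=tagv> ...". As a rough sketch of how the two rules combine, the Python below picks the process tag from the pluginInstance and assembles one line. The helper names process_tags and format_put are hypothetical and exist only for this illustration; the real work is done in C inside write_tsdb.

```python
# Illustrative model only; helper names are made up for this example.

def process_tags(plugin_instance):
    """Pick the "process" tag the way the two rules above do."""
    if plugin_instance:
        # Rule opentsdb_processes_one: PluginInstance "^." matches any
        # non-empty instance, which then becomes the tag value.
        return [("process", plugin_instance)]
    # Rule opentsdb_processes_all: PluginInstance "^$" matches an empty
    # instance, so the literal tag process=all is added instead.
    return [("process", "all")]


def format_put(metric, timestamp, value, tags):
    """Build one put line from a metric, a timestamp, a value and tags."""
    fields = ["put", metric, str(timestamp), str(value)]
    fields += ["%s=%s" % (k, v) for k, v in tags]
    return " ".join(fields)


host_tags = [("fqdn", "myhost.mydomain")]
line = format_put("processes.ps_state.running", 1436276102, 0,
                  host_tags + process_tags(""))
print(line)
# put processes.ps_state.running 1436276102 0 fqdn=myhost.mydomain process=all
```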

Regards,
Yves

@nathanielc commented

I have compiled and verified that this PR works for me. It works well to convert information from the identifier to tags. I used the config below and I was able to get data from collectd to InfluxDB with host specific tags and good metric names to fit the InfluxDB tagging scheme. I think many others will find this to be useful as it is exactly what I expected to be able to do out of the box.

LoadPlugin "write_tsdb"
LoadPlugin "match_regex"
LoadPlugin "target_set"

<Plugin write_tsdb>
  <Node>
    Host           "influxdb"
    Port           "4242"
    HostTags       "status=production deviceclass=www role=logging"
    StoreRates     false
    AlwaysAppendDS false
  </Node>
</Plugin>

<Chain "PreCache">
   <Rule "opentsdb_tags">
     <Match "regex">
       Plugin ".*"
     </Match>
     <Target "set">
       MetaDataSet "tsdb_tag_pluginInstance" "instance"
       MetaDataSet "tsdb_tag_type" "type"
       MetaDataSet "tsdb_tag_typeInstance" "type_instance"
       MetaDataSet "tsdb_tag_dsname" "dsname"
     </Target>
   </Rule>
</Chain>

@nathanielc commented

@ymettier I am getting tags without values with the above config for certain plugins. For example:

put disk 1437510759 787985 fqdn=collect1 instance=vda type=disk_io_time type_instance= dsname=io_time

Any idea why the tag key is being inserted without the value? Thanks

@nathanielc commented

@ymettier I changed the code locally to check for an empty value as well as an empty key:

In the TSDB_STRING_APPEND_SPRINTF macro definition I added/modified these lines:

const char *v = (value); \
if(k[0] != '\0' && v[0] != '\0') { \
    n = ssnprintf(ptr, remaining_len, " %s=%s", k, v); \

This works for me and now tags are only added if they have a value.
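
For readers who do not want to dig into the macro, here is the same check as a small Python model. It is a sketch under assumptions, not the C code: append_tag is a hypothetical helper, and the input reuses the disk example from the previous comment.

```python
# Illustrative model of the check; the actual fix is in the C macro.

def append_tag(message, key, value):
    """Append " key=value" only when both key and value are non-empty."""
    if key and value:
        return message + " %s=%s" % (key, value)
    return message


message = "put disk 1437510759 787985 fqdn=collect1"
for key, value in [("instance", "vda"), ("type", "disk_io_time"),
                   ("type_instance", ""), ("dsname", "io_time")]:
    message = append_tag(message, key, value)
print(message)
# put disk 1437510759 787985 fqdn=collect1 instance=vda type=disk_io_time dsname=io_time
```

The empty type_instance no longer produces a dangling "type_instance=" tag.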

@ymettier (Contributor, Author) commented

Thanks @nathanielc : I updated my code with your fix. See commit 87e08a2.

@ymettier (Contributor, Author) commented

Hello,

InfluxDB >= 0.9 users, please read !

As described (briefly) in https://influxdb.com/docs/v0.9/write_protocols/opentsdb.html, InfluxDB accepts metrics encoded for OpenTSDB.

So instead of sending "raw" Collectd metrics with the Collectd binary protocol, you can send them with write_tsdb. With this PR, you can easily rewrite the metric names and add tags, exactly as intended for OpenTSDB.

In other words, write_tsdb and this PR work very well with InfluxDB >= 0.9 !!!

Regards,
Yves

@nathanielc commented

+1 I have been running collectd patched with this PR and sending metrics to InfluxDB for a few weeks now. With the config I mentioned above I get a measurement per plugin and the rest of the metadata becomes tags. It's been really useful.
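
To make the "measurement per plugin, everything else as tags" result concrete, the Python sketch below splits a put line into its parts. It is only an illustration of the line layout, not InfluxDB's parser; parse_put is a hypothetical helper, and reading the metric as the measurement name is an assumption based on the description above.

```python
# Illustrative only: this is not how InfluxDB itself parses the input.

def parse_put(line):
    """Split a put line into (metric, timestamp, value, tags)."""
    fields = line.split()
    if len(fields) < 4 or fields[0] != "put":
        raise ValueError("not a put line: %r" % line)
    metric = fields[1]
    timestamp = int(fields[2])
    value = float(fields[3])
    # Every remaining field is expected to look like key=value.
    tags = dict(field.split("=", 1) for field in fields[4:])
    return metric, timestamp, value, tags


metric, timestamp, value, tags = parse_put(
    "put disk 1437510759 787985 fqdn=collect1 instance=vda "
    "type=disk_io_time dsname=io_time")
print(metric)  # disk
print(tags["instance"], tags["type"], tags["dsname"])
```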

@dlmarion commented Sep 8, 2015

Running into this issue with collectd -> opentsdb. Thanks for working on this. I'm going to pull it down and give it a try.

@aferrari-technisys commented

+1. Any idea when this will be merged into master? This would be an amazing feature. Thanks to everyone working on this.

@ghost commented Sep 10, 2015

Great job! Using it with collectd/Graphite-API/Grafana/InfluxDB.

Cheers,
Szop

@ghost commented Sep 11, 2015

Hey guys,

I've run into a problem loading my configuration. I've split it up into multiple files and now I'm getting:

 * Restarting statistics collection and monitoring daemon collectd
Found a configuration for the `network' plugin, but the plugin isn't loaded or didn't register a configuration callback.
Target `set': The `MetaDataSet' configuration option is not understood and will be ignored.
Target `set': The `MetaDataSet' configuration option is not understood and will be ignored.
Target `set': The `MetaDataSet' configuration option is not understood and will be ignored.
Target `set': The `MetaDataSet' configuration option is not understood and will be ignored.
Target `set': You need to set at least one of `Host', `Plugin', `PluginInstance' or `TypeInstance'.
Filter subsystem: Failed to create a set target.

Any idea what this means?

Cheers,
Szop

@cdgraff commented Sep 12, 2015

Hi all, I have the same error that @szop85 reported above.

To check whether I missed something, could someone share their collectd.conf so I can use the same file and see whether I made a mistake?

The version I'm using is: collectd 5.5.0.377.g5afcfb3

Downloaded from:
https://ci.collectd.org/job/pull-requests-prepare-tarball/lastSuccessfulBuild/artifact/collectd-5.5.0.377.g5afcfb3.tar.gz

thanks for any advice!
Ale

@ghost commented Sep 12, 2015

Hey @cdgraff,

can you verify if you have the syslog plugin enabled?

Cheers,
Szop

@cdgraff commented Sep 12, 2015

Hi @szop85, no, I didn't have it enabled. Just in case, I added it and I get the same error. The only difference is that the message is now logged to the messages file instead of the console.

I'm using the example provided by @nathanielc. If it is working for you, could you share your config file in a Gist? Thanks!

@ghost commented Sep 12, 2015

After splitting my collectd config I forgot to activate/load the syslog plugin. After loading the syslog plugin, everything was fine.

Cheers,
Szop

@ymettier (Contributor, Author) commented

Target `set': The `MetaDataSet' configuration option is not understood and will be ignored.

@szop85 : this is a feature of my patch. If you do not patch Collectd with PR #1107, Collectd cannot understand MetaDataSet.

Check if you successfully patched your Collectd and if you are really running the patched Collectd.

Regards,
Yves

@cdgraff commented Sep 12, 2015

@ymettier, can you quickly show me how to patch the master branch?

I did this and got an error:
[root@mon collectd]# git apply 1106.patch
[root@mon collectd]# git apply 1107.patch
1107.patch:795: trailing whitespace.

1107.patch:915: trailing whitespace.
while(NULL != (ptr2 = strchr(ptr2, ' '))) ptr2[0] = '_';
1107.patch:1022: trailing whitespace.

1107.patch:1040: trailing whitespace.

error: patch failed: src/collectd.conf.pod:8676
error: src/collectd.conf.pod: patch does not apply
error: patch failed: src/daemon/meta_data.c:106
error: src/daemon/meta_data.c: patch does not apply
error: patch failed: src/daemon/meta_data.h:43
error: src/daemon/meta_data.h: patch does not apply
error: patch failed: src/target_set.c:35
error: src/target_set.c: patch does not apply

I'm not good with git...
Thanks in advance!
Ale

@ymettier (Contributor, Author) commented

Use "git merge" and merge my branch into the Collectd master branch.
Then run clean.sh, build.sh and the usual compile chain.

@matejzero commented

I have also been testing this pull request on my system for the last week and it is working very well. I would love to see this merged to master and in the next release, so I can move my systems to InfluxDB.

@kev009 (Contributor) commented Dec 16, 2015

Functionally this LGTM

@mfournier modified the milestone: Features on Jan 21, 2016

@Dieken commented Mar 24, 2016

When will this get merged? It's tedious to patch and build collectd myself :-(

@ymettier could you rebase your patch series against branch master or the latest tag "collectd-5.5.1"? Here is how I do it, just a simple conflict in src/write_tsdb.c:

git fetch origin pull/1107/head:1107
git checkout 1107
git rebase --onto collectd-5.5.1 9a8d9ab6     # 9a8d9ab6 is the last upstream commit before your patches

This makes it much easier to merge than your original branch which involves many other upstream commits.

The small conflict needs a one-line patch against collectd-5.5.1 + pull/1107:

diff --git a/src/write_tsdb.c b/src/write_tsdb.c
index c92cd26..aff74e9 100644
--- a/src/write_tsdb.c
+++ b/src/write_tsdb.c
@@ -806,7 +806,7 @@ static int wt_send_message (const char* key, const char* value,

     if (message_len >= sizeof(message)) {
         ERROR("write_tsdb plugin(%s:%s): message buffer too small: "
-              "Need %zu bytes.", node, service, message_len + 1);
+              "Need %d bytes.", node, service, message_len + 1);
         return -1;
     }

If you rebase against branch "master", there is also a small conflict:

--- a/src/write_tsdb.c
+++ b/src/write_tsdb.c
@@@ -345,9 -522,20 +520,24 @@@ static int wt_format_name(char *ret, in
                            const char *ds_name)
  {
      int status;
+     int i;
      char *temp = NULL;
++<<<<<<< 887a9b3cf0b65f1118e71001985b456cb0aa5622
 +    const char *prefix = "";
++=======
+     char *prefix = NULL;
++>>>>>>> write_tsdb uses meta data to reformat metrics and set tags

This should be resolved by choosing the bottom half.


if (vl->meta) {
TSDB_META_DATA_GET_STRING(meta_tag[TSDB_TAG_PLUGIN]);
if(temp) {
Review comment from a contributor on the code above: Space after "if".

@kev009 (Contributor) commented Apr 6, 2016

Review fixes in #1655

@danoyoung commented

Any ETA on this? We would like to see this feature as well

@mfournier modified the milestones: Features, 5.6 on Aug 2, 2016

@octo (Member) commented Aug 11, 2016

Superseded by #1655

@octo closed this on Aug 11, 2016

@fenggolang commented Dec 30, 2016

@ymettier My /etc/collectd.conf
LoadPlugin interface
image

LoadPlugin match_regex
LoadPlugin target_set
image

And systemctl restart collectd
tail -f /var/log/collectd.log
image

thank you very much!
