Logstash spikes cpu @100% on grok parse failure #37

Closed

Description

jordansissel (Contributor)

(This issue was originally filed by @sujanks at elastic/logstash#2619)


Hi,

We have Logstash 1.4.2 agents consuming logs from SQS and indexing into Elasticsearch 1.4.2.

Every time we start the box, it runs for 2 to 5 minutes and then the CPU spikes to 100%, no matter how big the box is (EC2 m3.xlarge).

After spending a lot of time on it, we found out that it is due to the grok filter, but it's not clear why.

The following is the grok filter in our config file:

grok {
      match => [ "Message", "%{DATE:date} %{GREEDYDATA:time} \[%{GREEDYDATA:cloudhubapp}\] %{DATA:loglevel} %{DATA:application}\.%{DATA:component}\.%{DATA:subcomponent}\.%{DATA:position} - %{GREEDYDATA:keyvalue} Message=%{GREEDYDATA:message}" ]
}

These are sample log messages:

2015-02-11 05:03:01,209 [[digital-methode-subscriber].connector.http.mule.default.receiver.2104] INFO  apache.component.content-status-notification-publisher-http-handler.other - transactionID=160246_8aba3078-b1a2-11e4-90ab-ef3fd79aaa94 Message=Content Status Notification message successfully.

2015-02-13 03:09:07,813 [[digital-methode-subscriber].connector.http.mule.default.receiver.31] ERROR org.apache.retry.notifiers.ConnectNotifier - Failed to connectreconnect: Work Descriptor. Root Exception was: One or more parameters are invalid. Reason: Message must be shorter than 262144 bytes.. Type: class com.amazonaws.AmazonServiceException

If I remove everything after the hyphen (-) from the grok pattern, Logstash is fine and runs at about 20% CPU on an EC2 m1.large.

Removed part:

- %{GREEDYDATA:keyvalue} Message=%{GREEDYDATA:message}"

Any idea?

Sujan

Activity

jordansissel commented on Aug 7, 2015

The problem is likely that the use of GREEDYDATA doesn't give the engine enough information to match your data efficiently. You can read more about this kind of phenomenon under what Wikipedia calls ReDoS. Basically, my recommendation is to use the most specific patterns you can; GREEDYDATA is not very specific (it will match anything, including nothing), and such ambiguity can cause the regular expression engine to get bogged down trying to match things.
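Applying that advice to the sample lines above, here is a sketch of a more specific pattern. The field names, the NOTSPACE choices, and the anchors are assumptions on my part and are untested against the full log stream:

```
grok {
  match => [ "Message", "^%{TIMESTAMP_ISO8601:timestamp} \[%{NOTSPACE:cloudhubapp}\] %{LOGLEVEL:loglevel}\s+%{NOTSPACE:logger} - %{DATA:keyvalue} Message=%{GREEDYDATA:message}$" ]
}
```

Note that the ERROR sample line has no `Message=` section, so it would still fail this pattern and either need a second match entry or a `_grokparsefailure` handling path; the point is that the specific prefix (timestamp, bracketed thread, log level) lets failing lines be rejected quickly instead of triggering heavy backtracking.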

jordansissel commented on Aug 7, 2015

I do believe this to be an example of a pathological regexp that causes your parsing to be extremely slow, or to simply appear to be doing nothing but consuming 100% CPU.
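As an illustration of how that pathology arises, here is a sketch using Python's `re` module as a stand-in (grok runs on a different engine, Oniguruma, but it backtracks in the same way; this is not the plugin's actual code):

```python
import re

# grok's DATA expands to the lazy ".*?" (and GREEDYDATA to the greedy
# ".*"), so the application.component.subcomponent.position part of
# the reported pattern behaves like this simplified regex:
pat = re.compile(r"^(.*?)\.(.*?)\.(.*?)\.(.*?) - ")

# On a well-formed line the split is driven by the literal dots and
# the " - " delimiter, so matching is cheap:
m = pat.match("apache.component.http-handler.other - rest of line")
print(m.groups())  # ('apache', 'component', 'http-handler', 'other')

# On a line with many dots and no " - " delimiter, the engine must try
# every assignment of dots to the four groups before it can fail, and
# the number of attempts grows polynomially with the number of dots --
# this is the work that pins a CPU at 100% on unexpected input.
bad = "a." * 60 + "no delimiter"
assert pat.match(bad) is None
```

With only four ambiguous fields the blow-up is polynomial rather than exponential, but on long lines arriving at high volume that is already enough to saturate a core.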

jsvd commented on Sep 15, 2016

Closing this. Two notes about this issue in particular:

  1. Having two consecutive %{GREEDYDATA} patterns makes no sense. If the string being matched is something like "Once upon a time, there was a mouse.", the first GREEDYDATA will consume "Once upon a time, there was a" and the second will get "mouse." There's no way for the regex engine to figure out where each GREEDYDATA should start and stop, so it gives as much as possible to the first one.
  2. Adding anchors (guards) to the regex makes parse failure much faster (2x to 5x in this case).
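A minimal illustration of note 1, again using Python's `re` as a stand-in for the grok engine:

```python
import re

# GREEDYDATA is a plain greedy ".*", so two of them separated by a
# space behave like this:
pat = re.compile(r"(.*) (.*)")
m = pat.match("Once upon a time, there was a mouse.")

# The first ".*" grabs as much as it can and only gives back enough
# for the rest of the pattern to match, so the split lands on the
# LAST space -- the engine, not your data, decides the boundary.
assert m.group(1) == "Once upon a time, there was a"
assert m.group(2) == "mouse."
```

This is also why anchoring helps note 2: with a `^` anchor and a specific leading pattern, a line that doesn't start with the expected prefix is rejected after a few characters instead of being retried from every possible starting position.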

          Logstash spikes cpu @100% on grok parse failure · Issue #37 · logstash-plugins/logstash-filter-grok