Skip to content

took time value is much larger than that calculated by Profile API #29275

Closed
@owenericsson

Description

@owenericsson

Describe the feature: whole time calculation is wrong with took time value

Elasticsearch version (bin/elasticsearch --version): 5.3.2

Plugins installed: []

JVM version (java -version): JDK 1.8 update 132

OS version (uname -a if on a Unix-like system): CentOS 5.5

Description of the problem including expected versus actual behavior:
i would like to diagnostic for a query with a term aggregation, and then i added a profile:true to the query. however i see the time took 500+ms while i found out that the maximum time of a shard is just ~10ms.
would you some one please give me some hint that where the 400+ms take place

Provide logs (if relevant):
see attachment, for security reason i skipped the true result and renamed the index name and field name.

result.txt

Activity

bleskes

bleskes commented on Mar 28, 2018

@bleskes
Contributor

The profile api currently only profiles that query and aggregation parts. It doesn't measure the fetch phase nor the network overhead. I suggest you look at those.

IMO I think there are two things to do here:

  1. Potentially add a profiler for the fetch phase.
  2. Document the above to avoid confusion.

@owenericsson if you have any more questions please open up a thread on discuss.elastic.co and we can dive deeper. We keep github for bugs and features requests.

added
:Search/SearchSearch-related issues that do not fall into other categories
on Mar 28, 2018
elasticmachine

elasticmachine commented on Mar 28, 2018

@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

owenericsson

owenericsson commented on Mar 28, 2018

@owenericsson
Author

hi @bleskes ,
thanks for the reply, and also apologize for posting in the wrong place.

back to this problem

  1. i have set the size to 0, so there shouldn't be the fetch overhead.
  2. when i remove the "aggs" part, the time cost back to a normal level (< 100ms). So i am sure that the overhead must be from the aggs

BTW. the field type of the term aggregation i used is 'keyword', and do you think this overhead is from the GlobalOrdinals ? anyway i think it is very unlikely this overhead is from the network.

bleskes

bleskes commented on Mar 28, 2018

@bleskes
Contributor

OK, that changes things (although my comments are still valid in general). I will defer for someone from the search area to pick this up.

jpountz

jpountz commented on Mar 28, 2018

@jpountz
Contributor

Most likely most execution time is taken by something which is not measured by the profile, like the time it took to build global ordinals, time spent in the queue, time spent merging shard responses on the coordinating node, etc. I agree we shoud at least document this.

owenericsson

owenericsson commented on Mar 28, 2018

@owenericsson
Author

one more thing need attention
I created my index with 11 shards, by using a routing key which means for the Search above, only one shard owns all the result data.

added a commit that references this issue on Dec 5, 2018
cbuescher

cbuescher commented on Dec 5, 2018

@cbuescher
Member

Closing this with adding documentation in #36252, some other improvements to the way the Profile API works are tracked in #21073

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    :Search/SearchSearch-related issues that do not fall into other categories>docsGeneral docs changes

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @jpountz@bleskes@owenericsson@cbuescher@elasticmachine

        Issue actions

          took time value is much larger than that calculated by Profile API · Issue #29275 · elastic/elasticsearch