A great new addition to Log Insight 3.3 is the introduction of a query API. While the initial documentation for the feature has not been posted yet, it is in progress and should be available soon. In the meantime, I have included the latest information so you can start leveraging the API today. Read on to learn more!


Credit

First and foremost, shout out to Nick Kushmerick for putting together this documentation — this post is all him!

Background

The Log Insight Query API allows end-users to programmatically query Log Insight to retrieve events, and aggregations over events. The API:

  • Exposes simple event and aggregated-event queries as HTTP GETs:
    1. Events
      GET /api/v1/events/constraint1/constraint2/…?param1&param2&...
    2. Aggregated events
      GET /api/v1/aggregated-events/constraint1/constraint2/…?param1&param2&...
  • Allows structured queries against both static fields, and dynamic fields defined in content packs.
  • Offers several standard aggregation functions (COUNT, UCOUNT, AVG, MIN, MAX, SUM, STDEV, VARIANCE, SAMPLE) on both static fields, and dynamic fields defined in content packs.
  • Allows aggregating events by time into fixed-width bins.
  • Defaults to simple, fast queries:
    1. Events: up to 100 events from the last minute, with a 30-second timeout
    2. Aggregated events: as above, with 5-second time bins and the COUNT aggregation function

NOTE: The query API today has the same limits as the UI in terms of returned results.

Authentication

The Log Insight Query API requires authentication, and Log Insight denies requests from non-authorized users.  Specifically, the Query API requires authentication by a user with at least the “User” role. Before invoking the Query API, your client must first authenticate and obtain a session id by POSTing to /api/v1/sessions, and then send the session id with the special header X-LI-Session-Id in subsequent requests to the Query API.  For example:

$ curl -sk -X POST -H 'Content-Type: application/json' --data '{"username":"________", "password":"_______"}' https://_________/api/v1/sessions
{"userId":"8a30e36a-d525-48bb-aaa8-9deec8ba21f1","sessionId":"2pj5iYL....nQB6c=","ttl":1800}
$ curl -sk -H 'X-LI-Session-Id: 2pj5iYL....nQB6c=' https://_________/api/v1/events
{"complete":true, … }

More information about the Authentication API can be found in my previous post here.
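
If you would rather script the two-step flow above than use curl, here is a minimal Python sketch using only the standard library. The function names and the li.example.com host are placeholders of mine, not part of the API; the endpoint, JSON body and X-LI-Session-Id header are the ones described above.

```python
import json
import urllib.request


def build_login_request(host, username, password):
    """Build the POST /api/v1/sessions request that obtains a session id."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"https://{host}/api/v1/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query_request(host, session_id, path="/api/v1/events"):
    """Build a GET request that carries the session id in X-LI-Session-Id."""
    return urllib.request.Request(
        f"https://{host}{path}",
        headers={"X-LI-Session-Id": session_id},
        method="GET",
    )


def fetch_json(request, context=None):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, context=context) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call sequence would be session = fetch_json(build_login_request("li.example.com", user, password)) followed by fetch_json(build_query_request("li.example.com", session["sessionId"])). The context argument is where an ssl.SSLContext can be passed if your appliance uses a self-signed certificate (the curl examples sidestep this with -k).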

Specification

GET /api/v1/events/path&query

  • URL path and query – see details below
  • Request payload: none
  • Response:
    • Success:
      200 OK

      • Payload:
        {
            "events": [ event1, event2, … ],
            "complete": {true|false}
        }

      Where

      • ‘complete’ indicates whether the query result was fully computed before the timeout expired (true), or partial results are returned because the timeout expired (false).
      • Event:
        {
            "text": "original event text",
            "timestamp": 1234567890,
            "fields": [ field1, field2, … ]
        }
      • Field: there are two formats for a field, depending on whether its value (a) does not exist in the event itself, or (b) is a substring of the original event:
        (a)

        {
            "name": "myfield",
            "content": content
        }

        (b)

        {
            "name": "myfield",
            "startPosition": 47,
            "length": 18
        }
      • Content: a number (123.45) or “quoted string”
    • Failure:
      • 401 Unauthorized: the request is not authenticated or the user does not have the “User” role, or the simple query API is disabled
      • 400 Bad Request: the constraints are invalid (eg an invalid operator).
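
To make the two field formats concrete, here is a small Python sketch that flattens an events response into plain dictionaries. It assumes that startPosition is a zero-based character offset into the event text, which the specification above does not state explicitly, so verify that against your own data. The function names are mine.

```python
import json


def field_value(event, field):
    """Return a field's value from either of the two field formats."""
    if "content" in field:
        # Format (a): the value is carried directly in the field.
        return field["content"]
    # Format (b): the value is a substring of the original event text.
    # Assumption: startPosition is a zero-based character offset.
    start = field["startPosition"]
    return event["text"][start:start + field["length"]]


def parse_events(payload):
    """Parse a response payload into (complete, list of event dicts)."""
    response = json.loads(payload)
    events = []
    for event in response["events"]:
        fields = {f["name"]: field_value(event, f) for f in event.get("fields", [])}
        events.append(
            {"timestamp": event["timestamp"], "text": event["text"], "fields": fields}
        )
    return response["complete"], events
```

Check the returned complete flag before trusting the list: when it is false, the timeout expired and the events are only a partial result.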

GET /api/v1/aggregated-events/path&query

  • URL path and query – see details below
  • Request payload: none

Response:

  • Success:
    200 OK

    • Payload:
      {
          "bins": [ bin1, bin2, … ],
          "complete": {true|false}
      }

    Where

    • ‘complete’ indicates whether the query result was fully computed before the timeout expired (true), or partial results are returned because the timeout expired (false).
    • Bin:
      {
          "min-timestamp": 1234567000,
          "max-timestamp": 1234567999,
          "value": value
      }
    • Value: a number (123.45) for numeric aggregation functions, or an event for the SAMPLE aggregation function.
  • Failure:
    • 401 Unauthorized: the request is not authenticated or the user does not have the “User” role, or the simple query API is disabled
    • 400 Bad Request: the constraints are invalid (eg an invalid operator)
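
As a rough illustration of consuming the bins, the Python sketch below totals the values of a numeric aggregation (for example the default COUNT) and picks out the busiest bin. The function name is mine, and it does not handle SAMPLE responses, where each value is an event rather than a number.

```python
import json


def summarize_bins(payload):
    """Return (complete, total, peak_bin) for a numeric aggregation response."""
    response = json.loads(payload)
    bins = response["bins"]
    # Summing is meaningful for COUNT; for other functions adapt as needed.
    total = sum(b["value"] for b in bins)
    # The bin with the largest value, or None when no bins were returned.
    peak = max(bins, key=lambda b: b["value"]) if bins else None
    return response["complete"], total, peak
```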

Constraints in the URL path and query: constraint1/constraint2/…?key1=value1&key2=value2&…

  • URI path after /api/v1/events = zero or more constraints separated by “/”, optionally followed by “?” and then one or more key=value pairs separated by “&”
  • Constraint = one of…
    • “field/operator value”
      • Field
        • The text or timestamp magic fields
        • Any static field
        • A field defined in a content pack, referenced with the syntax content_pack_namespace:field_name (eg com.vmware.vsphere:vmw_user or com.lenovo.xclarity:lenovo_lxca_class).
      • Operator
        • Numeric operators
          • EQ (=), NE (!=), LE (<=), LT (<), GE (>=), GT (>)
        • String operators:
          • CONTAINS and NOT_CONTAINS
          • MATCHES_REGEX (=~) and NOT_MATCHES_REGEX (!=~)
        • Whitespace is optional with the terse form and mandatory with the verbose form.
        • There are no explicit STARTS_WITH or NOT_STARTS_WITH operators, but the same effect can be achieved with a trailing *; for example, text/CONTAINS foo* retrieves events containing “foo”, “foobar”, “foobaz”, etc.
      • Value
        • Must be numeric for numeric operators.
    • field/EXISTS
  • Phrases:
    • text/CONTAINS foo bar retrieves events that contain the phrase “foo bar” (perhaps separated by punctuation).
    • text/CONTAINS bar foo  retrieves events that contain the phrase in the opposite order.
    • text/CONTAINS foo/text/CONTAINS bar retrieves events that contain foo or bar in either order but not necessarily both.
    • text/=~foo.*bar/text/=~bar.*foo retrieves events that contain both foo and bar in either order.
  • key=value pairs
    • limit=10 — maximum number of events to retrieve (limit must be at most 20,000 for event queries and 2,000 for aggregation queries)
    • timeout=60000 — number of milliseconds to wait for response, if the exact result is not available then the response will be a partial result with “complete=false”
  • Default URI path: timestamp/>T?limit=100&timeout=30000
    where T = 1 minute ago
  • Everything must be URL-encoded. For example, for “/api/v1/events/foo/> 10” the actual URL must be “/api/v1/events/foo/%3E%2010” or “/api/v1/events/foo/%3E+10”
  • Detailed example:
    GET /api/v1/events/text/foobar/filepath/!bifbuz/build_number/> 12345/text/=~[A-Z]*/java_class/a/java_class/b

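
Hand-encoding constraints gets tedious quickly, so here is a minimal Python sketch of a URL builder. The function name and argument shapes are my own choices; the only behavior taken from the rules above is that constraints are joined with “/”, key=value pairs follow a “?”, and everything is URL-encoded.

```python
from urllib.parse import quote, urlencode


def events_url(constraints, params=None, base="/api/v1/events"):
    """Build a URL-encoded query path.

    constraints: list of (field, condition) pairs, where condition is
                 "operator value" (for example "> 10") or "EXISTS".
    params: optional dict of key=value options such as limit and timeout.
    """
    segments = []
    for field, condition in constraints:
        # safe="" also percent-encodes "/", ">" and spaces inside a segment.
        segments.append(quote(field, safe=""))
        segments.append(quote(condition, safe=""))
    url = "/".join([base] + segments)
    if params:
        url += "?" + urlencode(params)
    return url
```

For instance, events_url([("foo", "> 10")]) produces the /api/v1/events/foo/%3E%2010 form shown in the encoding rule above.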

AND/OR/NOT and duplicated field/operator combinations

Arbitrary AND/OR/NOT constraint trees cannot be expressed with the Query API today. For complex queries, the client may need to submit multiple requests and merge the results on the client side.

In general, constraints are ANDed: text/CONTAINS foo/size/>10 retrieves events that both contain “foo” and have a size field greater than 10. However, if there is more than one constraint for a given field and operator, those constraints are ORed. For example, text/CONTAINS foo/text/CONTAINS bar/size/>10 retrieves events that have a size field greater than 10 and that contain either “foo” or “bar”.

Arbitrary negation is not supported. However, there is a negated version of each operator: for example, CONTAINS and NOT_CONTAINS, LT and GE, etc. This is the same behavior as the UI.
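
If the combination rule is hard to picture, the Python sketch below models it locally: constraints that share a field and operator are ORed, and the resulting groups are ANDed. This is only my illustration of the behavior described above, not how Log Insight implements it; CONTAINS is simplified to a plain substring test and only two operators are included.

```python
from collections import defaultdict

# Two operators are enough to illustrate the combination rule.
OPERATORS = {
    "CONTAINS": lambda actual, wanted: wanted in str(actual),
    "GT": lambda actual, wanted: float(actual) > float(wanted),
}


def matches(event, constraints):
    """Apply (field, operator, value) constraints to a dict of field values."""
    groups = defaultdict(list)
    for field, operator, value in constraints:
        groups[(field, operator)].append(value)
    # AND across distinct field/operator pairs, OR within each pair.
    return all(
        field in event
        and any(OPERATORS[operator](event[field], v) for v in values)
        for (field, operator), values in groups.items()
    )
```

The same kind of local predicate is handy when you merge the results of several requests on the client side to emulate a query the API cannot express directly.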

URL syntax: /api/v1/aggregated-events/constraint1/constraint2/…?key1=value1&key2=value2&…

Same options as for /api/v1/events, with three additional key=value options:

  • bin-width=2000 — width in milliseconds of the time-range bins (default 5 seconds)
  • aggregation-function=AVG — the aggregation function:
    • COUNT — aggregate by counting the events in each bin (this is the default)
    • SAMPLE — aggregate by returning an arbitrary event from each bin
    • UCOUNT, AVG, MIN, MAX, SUM, STDEV, VARIANCE — aggregate events using the given aggregation function on the field specified by aggregation-field
  • aggregation-field=size — the field to be aggregated.  Not permitted for COUNT, SAMPLE; mandatory for all other aggregation functions.
  • Example:
    /api/v1/aggregated-events?bin-width=1000&aggregation-function=UCOUNT&aggregation-field=appname

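
The rule about when aggregation-field is allowed is easy to get wrong, so here is a short Python sketch that builds the query string and checks the rule on the client before sending anything. The function and argument names are mine; the parameter names and the list of functions come from the options above.

```python
from urllib.parse import urlencode

NO_FIELD_FUNCTIONS = {"COUNT", "SAMPLE"}
FIELD_FUNCTIONS = {"UCOUNT", "AVG", "MIN", "MAX", "SUM", "STDEV", "VARIANCE"}


def aggregated_events_url(function="COUNT", field=None, bin_width_ms=None):
    """Build an aggregated-events URL, validating the function/field pairing."""
    if function not in NO_FIELD_FUNCTIONS | FIELD_FUNCTIONS:
        raise ValueError(f"unknown aggregation function: {function}")
    if function in NO_FIELD_FUNCTIONS and field is not None:
        raise ValueError(f"aggregation-field is not permitted for {function}")
    if function in FIELD_FUNCTIONS and field is None:
        raise ValueError(f"aggregation-field is mandatory for {function}")
    params = {}
    if bin_width_ms is not None:
        params["bin-width"] = bin_width_ms
    params["aggregation-function"] = function
    if field is not None:
        params["aggregation-field"] = field
    return "/api/v1/aggregated-events?" + urlencode(params)
```

Calling aggregated_events_url("UCOUNT", "appname", 1000) reproduces the example URL above.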

Java Example

To tie this all together, how about an example Java class you can leverage? Good news, one is available here!

Summary

As you can see, the query API provides another means to get search results out of Log Insight. While any queries that can be done in Log Insight should be done in Log Insight, the query API allows you to manipulate, store, and display the events any way you wish. Have feedback on the new query API? Be sure to post on https://loginsight.vmware.com!