Doing Analytics in Ruby.
Sleek is a gem for doing analytics. It allows you to easily collect and
analyze events that happen in your app.
Sleek is a work in progress. Use with caution.
The easiest way to install Sleek is to add it to your Gemfile:
gem "sleek"
Or, if you want the latest hotness:
gem "sleek", github: "goshakkk/sleek"
Then, install it:
$ bundle install
Sleek requires MongoDB to work and assumes that you have Mongoid
configured already.
Finally, create the needed indexes:
$ rake db:mongoid:create_indexes
Namespaces are a great way to organize entirely different buckets of
data inside a single application. In Sleek, everything is namespaced.
Creating a namespaced instance of Sleek is easy:
sleek = Sleek[:my_namespace]
You then call everything on this instance.
The heart of analytics is in recording events. Events are things that
happen in your app that you want to track. Events are stored in event
buckets.
To send an event, call sleek.record, passing the event bucket name and
the event payload.
sleek.record(:purchases, {
  customer: { id: 1, name: "First Last", email: "[email protected]" },
  items: [{ sku: "TSTITM1", name: "Test Item 1", price: 1999 }],
  total: 1999
})
There are a few ways to analyze your data. The simplest one is counting:
it returns how many times the event has occurred.
sleek.queries.count(:purchases)
# => 42
To calculate an average, you must also specify which property the
average should be computed over:
sleek.queries.average(:purchases, target_property: :total)
# => 1999
You can limit the scope of events that analysis is run on by adding the
:timeframe option to any query call.
sleek.queries.count(:purchases, timeframe: :this_day)
# => 10
Some kinds of applications may need to analyze trends in the data. Using
intervals, you can break a timeframe into minutes, hours, days, weeks,
or months. To do so, pass the :interval option to any query call. Using
:interval also requires that you specify :timeframe.
sleek.queries.count(:purchases, timeframe: :this_2_days, interval: :daily)
# => [
# {:timeframe=>2013-01-01 00:00:00 UTC..2013-01-02 00:00:00 UTC, :value=>10},
# {:timeframe=>2013-01-02 00:00:00 UTC..2013-01-03 00:00:00 UTC, :value=>24}
# ]
The word “metrics” is used to describe analysis queries which return a
single numeric value.
Count just counts the number of events recorded.
sleek.queries.count(:bucket)
# => 42
Count unique counts how many events have a unique value for a given property.
sleek.queries.count_unique(:bucket, params)
You must pass the target property name in params like this:
sleek.queries.count_unique(:purchases, target_property: "customer.id")
# => 30
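To make the dotted "customer.id" property path concrete, here is a minimal plain-Ruby sketch of what a unique count over nested payloads could look like. It is not Sleek's implementation: dig_property and count_unique are hypothetical helpers that work on an in-memory array of hashes, and skipping events that lack the property is an assumption of the sketch.

```ruby
# Illustrative only: a plain-Ruby model of a unique count, not Sleek's code.

# Hypothetical helper: resolve a dotted path such as "customer.id" against
# a nested payload with symbol keys. Returns nil when any step is missing.
def dig_property(event, path)
  path.to_s.split(".").reduce(event) do |node, key|
    return nil unless node.is_a?(Hash)

    node[key.to_sym]
  end
end

# Count distinct values of the property. Events that lack the property are
# skipped (an assumption of this sketch).
def count_unique(events, target_property)
  events.map { |event| dig_property(event, target_property) }.compact.uniq.size
end

purchases = [
  { customer: { id: 1 }, total: 1999 },
  { customer: { id: 2 }, total: 2599 },
  { customer: { id: 1 }, total: 999 }
]

puts count_unique(purchases, "customer.id") # prints 2
```

Three purchases from two distinct customers yield 2, which is the difference between count_unique and a plain count.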
The minimum query finds the minimum numeric value for a given property.
All non-numeric values are ignored. If none of the property values are
numeric, nil will be returned.
sleek.queries.minimum(:bucket, params)
You must pass the target property name in params like this:
sleek.queries.minimum(:purchases, target_property: "total")
# => 10_99
The maximum query finds the maximum numeric value for a given property.
All non-numeric values are ignored. If none of the property values are
numeric, nil will be returned.
sleek.queries.maximum(:bucket, params)
You must pass the target property name in params like this:
sleek.queries.maximum(:purchases, target_property: "total")
# => 199_99
The average query finds the average value for a given property. All
non-numeric values are ignored. If none of the property values are
numeric, nil will be returned.
sleek.queries.average(:bucket, params)
You must pass the target property name in params like this:
sleek.queries.average(:purchases, target_property: "total")
# => 49_35
The sum query sums all the numeric values for a given property. All
non-numeric values are ignored. If none of the property values are
numeric, nil will be returned.
sleek.queries.sum(:bucket, params)
You must pass the target property name in params like this:
sleek.queries.sum(:purchases, target_property: "total")
# => 2_072_70
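The rule shared by minimum, maximum, average, and sum (ignore non-numeric values, return nil when nothing numeric remains) can be sketched in plain Ruby as follows. This is only a model of the documented behavior on an in-memory array; the metric and numeric_values helpers are hypothetical, and returning the average as a Float is an assumption.

```ruby
# Illustrative only: models the documented metric rules, not Sleek's code.

# Collect the values of a top-level property, keeping only numeric ones.
def numeric_values(events, property)
  events.map { |event| event[property] }.select { |value| value.is_a?(Numeric) }
end

# Compute one metric, returning nil when no numeric values are present.
def metric(events, property, kind)
  values = numeric_values(events, property)
  return nil if values.empty?

  case kind
  when :minimum then values.min
  when :maximum then values.max
  when :sum     then values.sum
  when :average then values.sum.to_f / values.size
  else raise ArgumentError, "unknown metric: #{kind}"
  end
end

purchases = [{ total: 1000 }, { total: "n/a" }, { total: 3000 }]

puts metric(purchases, :total, :minimum) # prints 1000
puts metric(purchases, :total, :sum)     # prints 4000
```

The string "n/a" is dropped before any arithmetic happens, so it affects neither the sum nor the divisor used for the average.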
Series allow you to analyze trends in metrics over time. They break a
timeframe into intervals and compute the metric for those intervals.
Calculating series is done by adding the :timeframe and :interval
options to the metric query.
Valid intervals are:
:hourly
:daily
:weekly
:monthly
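As a rough illustration of how a series breaks a timeframe into intervals, the sketch below counts timestamps per fixed-length slice of an absolute time range. It is an assumption-laden model rather than Sleek's code: series_count and INTERVAL_SECONDS are hypothetical names, :monthly is left out because months vary in length, and each slice is treated as including its start and excluding its end.

```ruby
# Illustrative only: fixed-length interval bucketing in plain Ruby.

# Hypothetical lookup of interval lengths in seconds (:monthly omitted).
INTERVAL_SECONDS = { hourly: 3_600, daily: 86_400, weekly: 604_800 }.freeze

# Split [from, to) into consecutive slices and count timestamps per slice.
def series_count(timestamps, from, to, interval)
  step = INTERVAL_SECONDS.fetch(interval)
  results = []
  start = from
  while start < to
    finish = [start + step, to].min
    # A timestamp belongs to a slice when start <= t < finish.
    value = timestamps.count { |t| t >= start && t < finish }
    results << { timeframe: start..finish, value: value }
    start = finish
  end
  results
end

events_at = [
  Time.utc(2013, 1, 1, 10),
  Time.utc(2013, 1, 2, 9),
  Time.utc(2013, 1, 2, 15)
]

series = series_count(events_at, Time.utc(2013, 1, 1), Time.utc(2013, 1, 3), :daily)
puts series.map { |slice| slice[:value] }.inspect # prints [1, 2]
```

The result has the same shape as the series output shown earlier: one hash per interval, each with a :timeframe range and a :value.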
In addition to using metrics and series, you sometimes want to group
their results by a specific property value.
For example, you might be wondering, “How much have we made from each of
our customers?” Group by will help you answer questions like this.
To group a metric or series result by the value of some property, pass
the :group_by option to the query.
sleek.queries.sum(:purchases, target_property: "total", group_by: "customer.email")
# => {"[email protected]"=>214998, "[email protected]"=>64999}
Or, you may wonder how much you made from each of your customers on
every day of this week.
sleek.queries.sum(:purchases, target_property: "total", timeframe: :this_week,
                  interval: :daily, group_by: "customer.email")
You can even combine it with filters. For example, how much did you make
from each of your customers on every day of this week on orders of $1000
or more?
sleek.queries.sum(:purchases, target_property: "total", filter: ["total", :gte, 1000_00],
                  timeframe: :this_week, interval: :daily, group_by: "customer.email")
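Conceptually, group by partitions the events by a property value and then computes the metric once per partition. Here is a minimal plain-Ruby sketch of a grouped sum over in-memory hashes. The helpers property and sum_by_group are hypothetical, the email addresses are made up for the example, and none of this is Sleek's actual implementation.

```ruby
# Illustrative only: a grouped sum over in-memory events.

# Hypothetical helper: look up a dotted path ("customer.email") in a nested
# payload with symbol keys.
def property(event, path)
  event.dig(*path.to_s.split(".").map(&:to_sym))
end

# Partition events by the group_by property, then sum the target property
# within each partition.
def sum_by_group(events, target_property:, group_by:)
  grouped = events.group_by { |event| property(event, group_by) }
  grouped.transform_values do |group|
    group.sum { |event| property(event, target_property) }
  end
end

purchases = [
  { customer: { email: "a@example.test" }, total: 100 },
  { customer: { email: "b@example.test" }, total: 50 },
  { customer: { email: "a@example.test" }, total: 200 }
]

totals = sum_by_group(purchases, target_property: "total", group_by: "customer.email")
puts totals.inspect
```

The result is a hash keyed by the grouping value, which is the same shape as the grouped output shown above.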
To limit the scope of events used in analysis, you can use a filter. To
do so, pass the :filter option to the query.
A single filter is a 3-element array, consisting of:
property_name - the property name to filter.
operator - the name of the operator to apply.
value - the value used in operator to compare to property value.
Operators: eq, ne, lt, lte, gt, gte, in.
You can pass either a single filter or an array of filters.
sleek.queries.count(:purchases, filter: [:total, :gt, 1599])
# => 20
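To show how the 3-element filter arrays and the listed operators fit together, here is a small plain-Ruby sketch that applies one filter or an array of filters to in-memory events. It only models the description above: OPERATORS, normalize_filters, and apply_filters are hypothetical names, multiple filters are assumed to combine with AND, and events missing the property are assumed not to match.

```ruby
# Illustrative only: evaluating [property_name, operator, value] filters.

# One lambda per operator named in the text.
OPERATORS = {
  eq:  ->(actual, expected) { actual == expected },
  ne:  ->(actual, expected) { actual != expected },
  lt:  ->(actual, expected) { actual < expected },
  lte: ->(actual, expected) { actual <= expected },
  gt:  ->(actual, expected) { actual > expected },
  gte: ->(actual, expected) { actual >= expected },
  in:  ->(actual, expected) { expected.include?(actual) }
}.freeze

# Accept either a single filter or an array of filters.
def normalize_filters(filter)
  return [] if filter.nil? || filter.empty?

  filter.first.is_a?(Array) ? filter : [filter]
end

# Keep only the events that satisfy every filter (AND is an assumption).
def apply_filters(events, filter)
  normalize_filters(filter).reduce(events) do |remaining, (name, operator, value)|
    test = OPERATORS.fetch(operator.to_sym)
    remaining.select do |event|
      actual = event[name.to_sym]
      !actual.nil? && test.call(actual, value)
    end
  end
end

purchases = [{ total: 1000 }, { total: 1599 }, { total: 2000 }, { total: 2500 }]

puts apply_filters(purchases, [:total, :gt, 1599]).size # prints 2
```

Passing [[:total, :gt, 1599], [:total, :lt, 2500]] narrows the same data to a single event, since both conditions must hold in this sketch.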
You can pass the :timeframe option, with or without :timezone, to any query.
Timeframe is used to limit your query to some window of time. You can
use a range of TimeWithZone objects to specify an absolute timeframe, or
you can use a string that describes a relative timeframe.
A relative timeframe string (or symbol) consists of these parts: category,
optional number, and interval specification. Possible categories are this
and previous; possible intervals are minute, hour, day, week, and month.
Examples: this_day, previous_3_weeks.
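The structure of a relative timeframe (category, optional number, interval) can be made explicit with a small parser. The sketch below only splits the string into its parts and does not compute the resulting time window, since the exact window boundaries are up to Sleek. TIMEFRAME_PATTERN and parse_timeframe are hypothetical names, and defaulting a missing number to 1 is an assumption.

```ruby
# Illustrative only: splitting a relative timeframe into its parts.

# category, optional number, interval (singular or plural).
TIMEFRAME_PATTERN = /\A(this|previous)_(?:(\d+)_)?(minute|hour|day|week|month)s?\z/

# Parse a string or symbol such as :this_day or "previous_3_weeks".
def parse_timeframe(spec)
  match = TIMEFRAME_PATTERN.match(spec.to_s)
  raise ArgumentError, "invalid timeframe: #{spec.inspect}" unless match

  {
    category: match[1].to_sym,
    count: (match[2] || 1).to_i, # a missing number is treated as 1
    unit: match[3].to_sym
  }
end

puts parse_timeframe(:previous_3_weeks).inspect
```

For example, :this_2_days parses to the category :this, a count of 2, and the unit :day.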
By default, relative times are transformed into ranges of time objects
in the UTC timezone. You can, however, pass the :timezone option to tell
Sleek to construct the window of time in the given timezone.
Refer to the ActiveSupport::TimeZone docs for more details on possible
timezone identifiers.
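# Delete everything in the namespace.
sleek.delete!
# Delete a single event bucket.
sleek.delete_bucket(:purchases)
# Delete a property from the events in a bucket.
sleek.delete_property(:purchases, :some_property)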
MIT.