Secondary Indexes index Document keys and parts of JSON values, and Reductions create aggregate values on that Index.
A View can also include a Reduction. When you map() documents and index them, you can group Index-Keys together and produce aggregate information as a "Reduction". This makes it possible to compute statistics over a collection of documents, among other creative uses of reduce. Continuing with our data set from Anatomy of View, let's add another document and a reduction to see how it works.
```
key: user::2a92jd02828
{
  doctype: "user",
  name: "Aldon Smith",
  username: "rudeboy",
  email: "asmith@email.com",
  points: 1000,
  last_login: 1360003359
}

key: user::828c201abf
{
  doctype: "user",
  name: "Byron Smith",
  username: "metallicafan",
  email: "bsmith@email.com",
  points: 2000,
  last_login: 1360002838
}

key: user::3f28f2929d
{
  doctype: "user",
  name: "Calvin Smith",
  username: "bono2830",
  email: "csmith@email.com",
  points: 3000,
  last_login: 1360001292
}

key: user::52a8289df
{
  doctype: "user",
  name: "Dexter Smith",
  username: "thedex",
  points: 0,
  last_login: 1360002939
}
```
```javascript
function(doc, meta) {
  // Ensure we are processing only user documents and that the index field exists;
  // other documents are ignored
  if (doc.doctype == "user" && doc.username) {
    // Index the username field, output the email
    emit(doc.username, doc.email);
  }
}
```
Reduce: _count
In this scenario we are also reducing the results of the map function to get a count of the number of Index-Keys in the Index. We are querying with reduce=true in the query parameters:
| Key | Value | Document Key |
|---|---|---|
| null | 4 | undefined |
Notice that the reduction collapses the rows in our index, so the result has no associated row key or document key (there are many rows behind it, potentially millions in a large data set).
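The collapse described above can be sketched locally in plain JavaScript. This is only an illustration that runs outside the server: `buildIndex` and `countReduce` are hypothetical helpers, not Couchbase APIs, `emit` is passed in explicitly so the map function can run on its own, and the key ordering is simplified to a plain string comparison.

```javascript
// Local sketch only: emulates indexing and a _count reduction in plain JavaScript.
const docs = {
  "user::2a92jd02828": { doctype: "user", username: "rudeboy", email: "asmith@email.com" },
  "user::828c201abf": { doctype: "user", username: "metallicafan", email: "bsmith@email.com" },
  "user::3f28f2929d": { doctype: "user", username: "bono2830", email: "csmith@email.com" },
  "user::52a8289df": { doctype: "user", username: "thedex" },
};

// The map function from above, with emit passed in so it can run outside the server.
function mapUser(doc, meta, emit) {
  if (doc.doctype == "user" && doc.username) {
    emit(doc.username, doc.email);
  }
}

// Hypothetical helper: run the map function over every document and collect the rows.
function buildIndex(documents, mapFn) {
  const rows = [];
  for (const [id, doc] of Object.entries(documents)) {
    mapFn(doc, { id }, (key, value) => rows.push({ key, value, id }));
  }
  // Simplified ordering: plain string comparison on the Index-Key.
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// Hypothetical helper: collapse all rows into one count with no row key or document key.
function countReduce(rows) {
  return { key: null, value: rows.length };
}

const indexRows = buildIndex(docs, mapUser);
const countResult = countReduce(indexRows);
console.log(countResult); // { key: null, value: 4 }
```

"thedex" is still counted even though the document has no email: the row is emitted with an undefined output value, and a count only looks at the number of rows.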
What if we wanted to see all users ordered by the number of points they have, and also see stats on the highest, lowest, and average points?
```javascript
function(doc, meta) {
  // Ensure we are processing only user documents that have a truthy points value;
  // other documents are ignored
  if (doc.doctype == "user" && doc.points) {
    // Index the username field and output the points
    emit(doc.username, doc.points);
  }
}
```
| # | Index-Key | Output Value | Document Key |
|---|---|---|---|
| 1 | "bono2830" | 3000 | user::3f28f2929d |
| 2 | "metallicafan" | 2000 | user::828c201abf |
| 3 | "rudeboy" | 1000 | user::2a92jd02828 |
If we query this View with reduce=false, we see only the results of the map() function. In this case, user "thedex" (points: 0) is not in the index, because the if statement tests doc.points directly, and 0, false, and null (along with undefined, NaN, and the empty string) are all logically false in JavaScript.
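The truthiness behavior can be checked in isolation. The sketch below uses made-up sample objects (the user "nopoints" is hypothetical) and compares the condition used in the map function with a stricter type check that keeps a value of 0.

```javascript
// Sample documents; "nopoints" is a hypothetical user with no points field at all.
const candidates = [
  { username: "rudeboy", points: 1000 },
  { username: "thedex", points: 0 },
  { username: "nopoints" },
];

// Condition from the map function: drops both the missing field and the value 0.
const truthyCheck = candidates
  .filter((doc) => doc.points)
  .map((doc) => doc.username);

// A stricter check on the type keeps 0 while still skipping a missing field.
const numberCheck = candidates
  .filter((doc) => typeof doc.points === "number")
  .map((doc) => doc.username);

console.log(truthyCheck); // [ 'rudeboy' ]
console.log(numberCheck); // [ 'rudeboy', 'thedex' ]
```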
Notice that the rows are sorted by username, so relative to points the order happens to be backwards. If we want to create a leaderboard of highest points, we have some things to consider. First, the _stats and _sum reduce functions only work on numbers. Let's look at the _stats result first, and then write another View that functions as a simple leaderboard, including people with 0 points.
The _stats and _sum built-in reduce functions only work on numerical output values; you will receive an error if you try to use them on strings.
The _count built-in reduce function can be used with any output values.
Reduce: _stats
| Key | Value | Document Key |
|---|---|---|
| null | {"sum":6000, "count":3, "min":1000, "max":3000, "sumsqr":14000000} | undefined |
From the summary data you can calculate average (2000 in this case) by using the sum and count, and you can also see the min/max. The sum of squares can be used for statistical calculations.
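As a small worked example of the calculations mentioned above, the following sketch derives the average from the _stats value and, as one possible use of the sum of squares, the population variance and standard deviation. The variable names are illustrative only.

```javascript
// Value returned by the _stats reduction shown above.
const stats = { sum: 6000, count: 3, min: 1000, max: 3000, sumsqr: 14000000 };

// Average: the sum divided by the number of rows.
const mean = stats.sum / stats.count;

// Population variance: mean of the squares minus the square of the mean.
const variance = stats.sumsqr / stats.count - mean * mean;
const stdDev = Math.sqrt(variance);

console.log(mean); // 2000
console.log(stdDev.toFixed(1)); // 816.5
```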
Because the _stats reduce above operates on the output values, the points must be the output value of the emit() function. For a leaderboard, however, we need to index the points themselves. We have two options. First, we can index the points and also output the points, which keeps the _stats reduction available when we want summary info; we then have to use a get() on the document key to retrieve the document information (username). Second, we can index the points and output the username, which gives the leaderboard directly but cannot be combined with _stats because the output values are strings.
```javascript
function(doc, meta) {
  // Ensure we are processing only user documents with zero or more points;
  // other documents are ignored
  if (doc.doctype == "user" && doc.points >= 0) {
    // Index the points field and output the points
    emit(doc.points, doc.points);
  }
}
```
| # | Index-Key | Output Value | Document Key |
|---|---|---|---|
| 1 | 0 | 0 | user::52a8289df |
| 2 | 1000 | 1000 | user::2a92jd02828 |
| 3 | 2000 | 2000 | user::828c201abf |
| 4 | 3000 | 3000 | user::3f28f2929d |
```javascript
function(doc, meta) {
  // Ensure we are processing only user documents with zero or more points;
  // other documents are ignored
  if (doc.doctype == "user" && doc.points >= 0) {
    // Index the points field and output the username
    emit(doc.points, doc.username);
  }
}
```
| # | Index-Key | Output Value | Document Key |
|---|---|---|---|
| 1 | 0 | "thedex" | user::52a8289df |
| 2 | 1000 | "rudeboy" | user::2a92jd02828 |
| 3 | 2000 | "metallicafan" | user::828c201abf |
| 4 | 3000 | "bono2830" | user::3f28f2929d |
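The index is stored in ascending Index-Key order, so a leaderboard needs the rows highest first. The sketch below reorders the rows from the table above on the client; `leaderboard` is a hypothetical helper, and the row shape (`key`, `value`, `id`) is an assumption made for illustration. Against a real server, the same ordering would more likely come from querying the View in descending order with a row limit (assumed here to be the `descending=true` and `limit` query parameters) than from sorting on the client.

```javascript
// Rows from the table above, in ascending Index-Key order.
const viewRows = [
  { key: 0, value: "thedex", id: "user::52a8289df" },
  { key: 1000, value: "rudeboy", id: "user::2a92jd02828" },
  { key: 2000, value: "metallicafan", id: "user::828c201abf" },
  { key: 3000, value: "bono2830", id: "user::3f28f2929d" },
];

// Hypothetical helper: highest points first, limited to the top `limit` rows.
// The input array is copied so the original row order is left untouched.
function leaderboard(rows, limit) {
  return [...rows].sort((a, b) => b.key - a.key).slice(0, limit);
}

const top3 = leaderboard(viewRows, 3).map(
  (row, i) => `${i + 1}. ${row.value} (${row.key})`
);
console.log(top3.join("\n"));
// 1. bono2830 (3000)
// 2. metallicafan (2000)
// 3. rudeboy (1000)
```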
There are many different ways to use reductions, and you can write your own custom reducers. Be aware, however, that custom reducers can increase latency compared with the built-in ones, because they require more CPU to compute. See the Map-Reduce Examples for more details on additional reduces and a sample custom reducer.
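To give a rough idea of what a custom reducer involves, here is a minimal sketch that assumes the `(key, values, rereduce)` function signature for custom reduces. The name `sumAndCount` is illustrative, and the calls at the bottom only emulate locally how partial results might be combined; they are not how the server invokes the function.

```javascript
// Sketch of a custom reduce that tracks a sum and a count of the output values.
function sumAndCount(key, values, rereduce) {
  var result = { sum: 0, count: 0 };
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      // values are partial results returned by earlier calls to this function
      result.sum += values[i].sum;
      result.count += values[i].count;
    } else {
      // values are the raw output values passed to emit()
      result.sum += values[i];
      result.count += 1;
    }
  }
  return result;
}

// Local emulation: reduce two groups of rows separately, then combine them.
var partA = sumAndCount(null, [1000, 2000], false);
var partB = sumAndCount(null, [3000], false);
var combined = sumAndCount(null, [partA, partB], true);
console.log(combined); // { sum: 6000, count: 3 }
```

Returning the sum and count rather than a finished average keeps the result correct when partial results are combined, since an average of averages is not generally the overall average.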