Database Reference
In-Depth Information
Example 3-35. aggregate() in Python
sumCount
=
nums
.
aggregate
((
0
,
0
),
(
lambda
acc
,
value
:
(
acc
[
0
]
+
value
,
acc
[
1
]
+
1
),
(
lambda
acc1
,
acc2
:
(
acc1
[
0
]
+
acc2
[
0
],
acc1
[
1
]
+
acc2
[
1
]))))
return
sumCount
[
0
]
/
float
(
sumCount
[
1
])
Example 3-36. aggregate() in Scala
val
result
=
input
.
aggregate
((
0
,
0
))(
(
acc
,
value
)
=>
(
acc
.
_1
+
value
,
acc
.
_2
+
1
),
(
acc1
,
acc2
)
=>
(
acc1
.
_1
+
acc2
.
_1
,
acc1
.
_2
+
acc2
.
_2
))
val
avg
=
result
.
_1
/
result
.
_2
.
toDouble
Example 3-37. aggregate() in Java
class
AvgCount
implements
Serializable
{
public
AvgCount
(
int
total
,
int
num
)
{
this
.
total
=
total
;
this
.
num
=
num
;
}
public
int
total
;
public
int
num
;
public
double
avg
()
{
return
total
/
(
double
)
num
;
}
}
Function2
<
AvgCount
,
Integer
,
AvgCount
>
addAndCount
=
new
Function2
<
AvgCount
,
Integer
,
AvgCount
>()
{
public
AvgCount
call
(
AvgCount
a
,
Integer
x
)
{
a
.
total
+=
x
;
a
.
num
+=
1
;
return
a
;
}
};
Function2
<
AvgCount
,
AvgCount
,
AvgCount
>
combine
=
new
Function2
<
AvgCount
,
AvgCount
,
AvgCount
>()
{
public
AvgCount
call
(
AvgCount
a
,
AvgCount
b
)
{
a
.
total
+=
b
.
total
;
a
.
num
+=
b
.
num
;
return
a
;
}
};
AvgCount
initial
=
new
AvgCount
(
0
,
0
);
AvgCount
result
=
rdd
.
aggregate
(
initial
,
addAndCount
,
combine
);
System
.
out
.
println
(
result
.
avg
());
Some actions on RDDs return some or all of the data to our driver program in the
form of a regular collection or value.