Let us now look at an example of a MADlib function implementation for linear regression.
As we learned in Chapter 3, Advanced Analytics - Paradigms, Tools, and Techniques, linear regression is a statistical technique that fits a linear equation to data.
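For reference, the linear equation being fitted can be written as below. The symbols are generic notation introduced here for illustration, not taken from the chapter: y is the value to predict, x_1 to x_k are the input columns, and the beta terms are the coefficients that the regression estimates.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon
```

Once the coefficients are known, a prediction for a new row is the dot product of the coefficient vector with the vector (1, x_1, ..., x_k); the leading 1 multiplies the intercept \beta_0.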
The MADlib prediction function that we will use for this purpose is as follows:
linregr_predict(
    coefficient,
    col_ind
)
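As a minimal sketch of what this call does, the function can be invoked directly with literal arrays. This assumes MADlib is installed in a schema named madlib (the usual default, but an assumption here); the coefficient and input values are made up purely for illustration.

```sql
-- Assumption: MADlib lives in the "madlib" schema.
-- First argument: model coefficients, intercept first (illustrative values).
-- Second argument: independent variables for one row, with a leading 1
-- so that the intercept is included.
SELECT madlib.linregr_predict(
    ARRAY[10.0, 2.0, 0.5]::float8[],
    ARRAY[1, 3, 4]::float8[]
) AS prediction;
-- The prediction is the dot product: 10.0*1 + 2.0*3 + 0.5*4 = 18
```

In practice the coefficient array comes from a trained model rather than from literals, and the second array is built from the columns of each row to be scored.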
Following are the steps to implement and run MADlib functions in Greenplum:
1. Create the dataset for running the regression function:
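-- quantity holds fractional values (2.5, 1.5) in the data below,
-- so it is declared FLOAT rather than INT
CREATE TABLE items (id INT, tax INT,
quantity FLOAT, price INT,
size INT);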
COPY items FROM STDIN WITH DELIMITER '|';
59 | 2 | 1 | 500 | 770
105 | 3 | 2 | 850 | 1410
2 | 3 | 1 | 225 | 1060
87 | 2 | 2 | 900 | 1300
132 | 3 | 2 | 1330 | 1500
135 | 2 | 1 | 905 | 820
279 | 3 | 2.5 | 2600 | 2130
68 | 2 | 1 | 1425 | 1170
184 | 3 | 2 | 1600 | 1500
368 | 4 | 2 | 2400 | 2790
166 | 3 | 1 | 870 | 1030
162 | 3 | 2 | 1186 | 1250
310 | 3 | 2 | 1400 | 1760
207 | 2 | 3 | 1480 | 1550
65 | 3 | 1.5 | 650 | 1450
\.
2. Build a regression model:
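A minimal sketch of this step, under several assumptions: a MADlib release that provides madlib.linregr_train (older releases exposed a different interface), MADlib installed in the madlib schema, and price chosen as the value to predict from tax, quantity, and size. The choice of dependent variable and the output table name items_linregr are hypothetical, picked here only for illustration.

```sql
-- Assumption: price is the dependent variable; items_linregr is a
-- hypothetical name for the model output table.
DROP TABLE IF EXISTS items_linregr, items_linregr_summary;

SELECT madlib.linregr_train(
    'items',                          -- source table
    'items_linregr',                  -- output (model) table
    'price',                          -- dependent variable
    'ARRAY[1, tax, quantity, size]'   -- independent variables; 1 adds an intercept
);

-- Inspect the fitted coefficients and the R-squared value.
SELECT coef, r2 FROM items_linregr;

-- Score every row with the fitted model and compare with the actual price.
SELECT i.id,
       i.price,
       madlib.linregr_predict(
           m.coef,
           ARRAY[1, i.tax, i.quantity, i.size]::float8[]
       ) AS predicted_price
FROM items i, items_linregr m
ORDER BY i.id;
```

The independent-variable array passed to linregr_predict must list the columns in the same order used for training, including the leading 1, so that each value lines up with its coefficient.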