A fintech company is a place where you can borrow and lend money. The power to send money is limitless. As a Forbes article put it, “Fintech companies, as they’ve come to be called, are easing payment processes, reducing fraud, saving users money, promoting financial planning, and ultimately moving a giant industry forward.” When talking about fintech companies, one that comes to mind is Prosper. In this blog, I will use their data to perform the analysis.

There are 113937 loans and 81 features in the dataset. I only use 13 of them, since covering all of the features would make this blog too long. I rename the columns to make them shorter, and drop duplicate rows so that each observation corresponds to one person.

Note that this blog is an exploratory data analysis. It performs a grid search analysis, which might be too heavy for some general readers. Nonetheless, it contains some useful information and findings. Specifically, the parts labeled “analysis” may provide some useful insights.

Because this dataset has so many variables, I only use 12 of them. Those are:

Univariate Plot Section

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##  0.00000  0.00000  0.00000  0.04228  0.00000 39.00000
## 
##     0     1     2     3     4     5     6     7     8     9    14    16 
## 87862  2438   403    78    23     8     3     3     2     5     1     2 
##    21    24    39 
##     1     1     1

Recommendations: Number of recommendations.

This is the number of recommendations that borrowers have when listing their loans. We can see that the distribution is right skewed. With a mean of 0.04, the average borrower has almost no recommendations, yet the count goes as high as 39 recommendations for a single borrower.
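As a quick sanity check, the summary statistics can be recomputed from the frequency table alone. This is a minimal Python sketch, not the code behind the output above; the dictionary is copied by hand from the table, and the variable names are my own.

```python
# Frequency table copied from the output above:
# {number of recommendations: number of borrowers}
recommendation_counts = {
    0: 87862, 1: 2438, 2: 403, 3: 78, 4: 23, 5: 8, 6: 3, 7: 3,
    8: 2, 9: 5, 14: 1, 16: 2, 21: 1, 24: 1, 39: 1,
}

total_borrowers = sum(recommendation_counts.values())

# Weighted mean: each value counts once per borrower who has it.
weighted_sum = sum(k * n for k, n in recommendation_counts.items())
mean_recommendations = weighted_sum / total_borrowers

# Share of borrowers with no recommendations at all.
share_with_none = recommendation_counts[0] / total_borrowers

print(total_borrowers)                  # 90831
print(round(mean_recommendations, 5))   # 0.04228
print(f"{share_with_none:.1%} of borrowers have zero recommendations")
```

The recomputed mean matches the summary, and the large share of zeros explains why the median and both quartiles are 0.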

I chose a histogram to depict the distribution of this numerical variable. Since it is right skewed, I log scale the number of recommendations.
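To see why a log scale helps here, compare raw counts with their base-10 logarithms. The sketch below is in Python and assumes the log scale is applied to the bar heights (the counts); that is my assumption, since the plotting code is not shown. The counts are a subset taken from the table above.

```python
import math

# A few rows from the frequency table above: {recommendations: borrowers}
counts = {0: 87862, 1: 2438, 2: 403, 9: 5, 39: 1}

# On a linear axis the bar for 0 dwarfs everything else; on a log10 axis
# each step of 1 is a factor of 10, so small bars stay visible.
log_heights = {k: math.log10(n) for k, n in counts.items()}

for k, height in log_heights.items():
    print(f"{k:>2} recommendations: count={counts[k]:>6}, log10(count)={height:.2f}")
```

On the linear scale the tallest bar is tens of thousands of times higher than the smallest; on the log scale the heights span only from 0 to just under 5.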

Rate: The Borrower’s interest rate for this loan.

The borrower’s rate follows an almost unimodal distribution, with a peak around 0.16. A small spike occurs around 0.3.

## 
##     1     2     3     4     5     6     7 
##  5487  7521 10839 15224 12633 11081  4151

Rating: The Prosper Rating assigned at the time the listing was created, ranging from AA to HR. Applicable for loans originated after July 2009.

The rating can also be null if the Prosper system can’t rate the loan. About 29084 loans aren’t rated by Prosper, which means that those loans originated before July 2009. The counts almost follow the order of the ratings, except that the A grade has the highest count and AA comes second, with the rest following in order.

## 
##     0     1     2     3     4     5     6     7     8     9    10    11 
## 14398 47562  5242  5406  1851   561  1738  7903   150    64    76   179 
##    12    13    14    15    16    17    18    19    20 
##    44  1602   746  1117   253    41   669   597   632

Category: The category of the listing that the borrower selected when posting their listing: 0 - Not Available, 1 - Debt Consolidation, 2 - Home Improvement, 3 - Business, 4 - Personal Loan, 5 - Student Use, 6 - Auto, 7 - Other, 8 - Baby & Adoption, 9 - Boat, 10 - Cosmetic Procedure, 11 - Engagement Ring, 12 - Green Loans, 13 - Household Expenses, 14 - Large Purchases, 15 - Medical/Dental, 16 - Motorcycle, 17 - RV, 18 - Taxes, 19 - Vacation, 20 - Wedding Loans.

I chose a bar chart for the listing category since it is a categorical variable. Three categories stand out as the largest: 0 (Not Available), 1 (Debt Consolidation), and 7 (Other). Because Not Available and Other are catch-all labels, we can’t know the specific purpose of those loans. The clear leader is category 1 (Debt Consolidation), where someone takes out one loan to pay off several others. It accounts for 47562 loans in the table above, overshadowing every other category. It could be that many Prosper visitors already have loans and are looking for a new loan to pay them off.
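The numeric codes are hard to read on their own, so it helps to attach the labels and rank the categories. Below is a minimal Python sketch; the counts are copied by hand from the table above and the labels from the category description, while the variable names are my own.

```python
# Labels from the category description above.
category_labels = {
    0: "Not Available", 1: "Debt Consolidation", 2: "Home Improvement",
    3: "Business", 4: "Personal Loan", 5: "Student Use", 6: "Auto",
    7: "Other", 8: "Baby & Adoption", 9: "Boat", 10: "Cosmetic Procedure",
    11: "Engagement Ring", 12: "Green Loans", 13: "Household Expenses",
    14: "Large Purchases", 15: "Medical/Dental", 16: "Motorcycle",
    17: "RV", 18: "Taxes", 19: "Vacation", 20: "Wedding Loans",
}

# Counts copied from the frequency table above: {category code: loans}
category_counts = {
    0: 14398, 1: 47562, 2: 5242, 3: 5406, 4: 1851, 5: 561, 6: 1738,
    7: 7903, 8: 150, 9: 64, 10: 76, 11: 179, 12: 44, 13: 1602,
    14: 746, 15: 1117, 16: 253, 17: 41, 18: 669, 19: 597, 20: 632,
}

total_loans = sum(category_counts.values())

# Sort categories from most to least common and keep the top three.
ranked = sorted(category_counts.items(), key=lambda item: item[1], reverse=True)
top_three = [(category_labels[code], n) for code, n in ranked[:3]]

for label, n in top_three:
    print(f"{label}: {n} loans ({n / total_loans:.1%})")
```

Ranked this way, Debt Consolidation alone covers a little over half of the loans in the table.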

## False  True 
## 45292 45539

A borrower is classified as a homeowner if they have a mortgage on their credit profile or provide documentation confirming they are a homeowner. Looking at the listed loans, we see that homeowners and non-homeowners appear in roughly equal proportion, so this variable doesn’t seem to affect much.
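The "roughly equal" claim is easy to check from the two counts. A small Python sketch, with the numbers copied from the table above and names of my own choosing:

```python
# Counts copied from the table above.
homeowner_counts = {"False": 45292, "True": 45539}

total = sum(homeowner_counts.values())
proportions = {status: n / total for status, n in homeowner_counts.items()}

for status, share in proportions.items():
    print(f"Homeowner={status}: {share:.2%}")

# Absolute gap between the two groups, in percentage points.
gap_points = abs(proportions["True"] - proportions["False"]) * 100
print(f"Gap: {gap_points:.2f} percentage points")
```

The gap comes out well below one percentage point, so the two groups are close to a 50/50 split.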

Income is the monthly income the borrower stated at the time the listing was created. Again, there isn’t much going on with this variable. Monthly income will definitely be right skewed, since fewer people have a higher salary, so I cut the outliers and apply a log10 scale.
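Cutting outliers and then log-scaling could look roughly like the Python sketch below. Everything in it is an assumption for illustration: the function name, the percentile cutoff, the choice to drop zero incomes (log10 of 0 is undefined), and the sample incomes, which are made up and not taken from the Prosper data.

```python
import math

def trim_and_log10(incomes, upper_percent=99):
    """Drop non-positive values and the top tail, then take log10.

    Hypothetical helper: the cutoff rule here (keep values up to the
    given percentile, nearest-rank method) is an assumption, not the
    rule used for the plot in this post.
    """
    positive = sorted(x for x in incomes if x > 0)
    # Nearest-rank percentile: index of the largest value we keep.
    cutoff_index = max(0, math.ceil(upper_percent * len(positive) / 100) - 1)
    cutoff = positive[cutoff_index]
    kept = [x for x in positive if x <= cutoff]
    return [math.log10(x) for x in kept], cutoff

# Made-up monthly incomes with one extreme value.
sample_incomes = [0, 1500, 2500, 3200, 4000, 4800, 5500, 7000, 9000, 12000, 1750000]

log_incomes, cutoff = trim_and_log10(sample_incomes, upper_percent=90)
print(f"cutoff: {cutoff}")
print([round(v, 2) for v in log_incomes])
```

After trimming, the extreme value no longer stretches the axis, and the log10 transform spreads the remaining incomes more evenly.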