MLlib is Apache Spark's scalable machine learning library.

Ease of use

Usable in Java, Scala, Python, and R.

MLlib fits into Spark's APIs and interoperates with NumPy in Python (as of Spark 0.9) and R libraries (as of Spark 1.5). You can use any Hadoop data source (e.g. HDFS, HBase, or local files), making it easy to plug into Hadoop workflows.

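from pyspark.ml.clustering import KMeans

# `spark` is an existing SparkSession, e.g. from the pyspark shell
data = spark.read.format("libsvm")\
  .load("hdfs://...")

model = KMeans(k=10).fit(data)  # cluster into 10 groups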
Calling MLlib in Python

Performance

High-quality algorithms, up to 100x faster than MapReduce.

Spark excels at iterative computation, enabling MLlib to run fast. At the same time, we care about algorithmic performance: MLlib contains high-quality algorithms that leverage iteration, and can yield better results than the one-pass approximations sometimes used on MapReduce.
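
To make the point about iteration concrete, here is a minimal plain-Python sketch of batch gradient descent for logistic regression. It is not MLlib code and not how MLlib implements logistic regression; the toy dataset, the learning rate, and the function names are all assumptions chosen for illustration. Every pass re-reads the same in-memory dataset, which is the access pattern that benefits from Spark keeping data cached between iterations, and the loss after many passes ends up lower than after a single pass.

```python
import math

# Hypothetical toy dataset: (features, label) pairs, linearly separable.
DATA = [
    ((1.0, 2.0), 1), ((2.0, 3.0), 1), ((3.0, 3.5), 1),
    ((-1.0, -1.5), 0), ((-2.0, -1.0), 0), ((-3.0, -2.5), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(weights, data):
    """Mean negative log-likelihood of the data under the model."""
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

def train(data, passes, learning_rate=0.1):
    """Run `passes` full-batch gradient descent steps from zero weights."""
    weights = [0.0] * len(data[0][0])
    for _ in range(passes):
        grad = [0.0] * len(weights)
        for x, y in data:  # one full pass over the same dataset
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad[j] += error * xi
        weights = [w - learning_rate * g / len(data)
                   for w, g in zip(weights, grad)]
    return weights

one_pass = train(DATA, passes=1)
many_passes = train(DATA, passes=100)
print("loss after 1 pass:    ", round(log_loss(one_pass, DATA), 4))
print("loss after 100 passes:", round(log_loss(many_passes, DATA), 4))
```

On a cluster, each of those passes would be a distributed job over the same dataset, so the cost of reloading the input on every pass is what an iterative engine tries to avoid.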

Figure: Logistic regression in Hadoop and Spark

Runs everywhere

Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud, against diverse data sources.

You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. Access data in HDFS, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.

Algorithms

MLlib contains many algorithms and utilities.

ML algorithms include:

  • Classification: logistic regression, naive Bayes,...
  • Regression: generalized linear regression, survival regression,...
  • Decision trees, random forests, and gradient-boosted trees
  • Recommendation: alternating least squares (ALS)
  • Clustering: K-means, Gaussian mixtures (GMMs),...
  • Topic modeling: latent Dirichlet allocation (LDA)
  • Frequent itemsets, association rules, and sequential pattern mining

ML workflow utilities include:

  • Feature transformations: standardization, normalization, hashing,...
  • ML Pipeline construction
  • Model evaluation and hyper-parameter tuning
  • ML persistence: saving and loading models and Pipelines

Other utilities include:

  • Distributed linear algebra: SVD, PCA,...
  • Statistics: summary statistics, hypothesis testing,...

Refer to the MLlib guide for usage examples.

Community

MLlib is developed as part of the Apache Spark project. It thus gets tested and updated with each Spark release.

If you have questions about the library, ask on the Spark mailing lists.

MLlib is still a rapidly growing project and welcomes contributions. If you'd like to submit an algorithm to MLlib, read how to contribute to Spark and send us a patch!

Getting started

To get started with MLlib:

  • Download Spark. MLlib is included as a module.
  • Read the MLlib guide, which includes various usage examples.
  • Learn how to deploy Spark on a cluster if you'd like to run in distributed mode. You can also run locally on a multicore machine without any setup.