{"id":1156,"date":"2023-04-13T14:56:24","date_gmt":"2023-04-13T05:56:24","guid":{"rendered":"http:\/\/edu.ujhb.org\/?p=1156"},"modified":"2023-04-13T14:56:24","modified_gmt":"2023-04-13T05:56:24","slug":"spark%e5%85%a5%e9%96%80%ef%bc%88%e6%97%a5%e6%9c%ac%e8%aa%9e%e7%89%88%ef%bc%89","status":"publish","type":"post","link":"https:\/\/edu.ujhb.org\/?p=1156","title":{"rendered":"Spark\u5165\u9580\uff08\u65e5\u672c\u8a9e\u7248\uff09"},"content":{"rendered":"\n<p>I want to make IT tutorial videos. Could you lay out the text content for the video tutorials based on the content of the website? Let's start with a Spark tutorial, and begin by giving me a list of the topics to cover.<\/p>\n\n\n\n<p>Certainly. Here are topics that a Spark tutorial could cover.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Introduction to Spark and installation<\/li>\n\n\n\n<li>Spark core concepts:<\/li>\n<\/ol>\n\n\n\n<p>a. RDD (Resilient Distributed Dataset)<\/p>\n\n\n\n<p>b. Spark SQL<\/p>\n\n\n\n<p>c. DataFrame and Dataset<\/p>\n\n\n\n<p>d. 
Spark Streaming<\/p>\n\n\n\n<p>e. MLlib (machine learning library)<\/p>\n\n\n\n<p>f. GraphX (graph processing library)<\/p>\n\n\n\n<ol class=\"wp-block-list\" start=\"3\">\n<li>Spark programming languages:<\/li>\n<\/ol>\n\n\n\n<p>a. Scala<\/p>\n\n\n\n<p>b. Java<\/p>\n\n\n\n<p>c. Python<\/p>\n\n\n\n<ol class=\"wp-block-list\" start=\"4\">\n<li>Spark cluster architecture:<\/li>\n<\/ol>\n\n\n\n<p>a. Spark Standalone<\/p>\n\n\n\n<p>b. Hadoop YARN<\/p>\n\n\n\n<p>c. Apache Mesos<\/p>\n\n\n\n<ol class=\"wp-block-list\" start=\"5\">\n<li>Spark application development:<\/li>\n<\/ol>\n\n\n\n<p>a. The Spark application lifecycle<\/p>\n\n\n\n<p>b. Configuring Spark applications<\/p>\n\n\n\n<p>c. Deploying and running Spark applications<\/p>\n\n\n\n<ol class=\"wp-block-list\" start=\"6\">\n<li>Spark performance tuning:<\/li>\n<\/ol>\n\n\n\n<p>a. Hardware optimization<\/p>\n\n\n\n<p>b. Software optimization<\/p>\n\n\n\n<p>c. Tuning Spark application configuration<\/p>\n\n\n\n<ol class=\"wp-block-list\" start=\"7\">\n<li>Practical Spark use cases:<\/li>\n<\/ol>\n\n\n\n<p>a. 
Using Spark for data processing and cleaning<\/p>\n\n\n\n<p>b. Using Spark for real-time data analysis<\/p>\n\n\n\n<p>c. Using Spark for machine learning<\/p>\n\n\n\n<p>d. Using Spark for graph processing<\/p>\n\n\n\n<p>This is only an outline; you can choose which topics to cover based on your needs and goals.<\/p>\n\n\n\n<p>Please explain the introduction to Spark and its installation in detail.<\/p>\n\n\n\n<p>Of course. Here is an overview of Spark and the steps to install it.<\/p>\n\n\n\n<p>About Spark:<\/p>\n\n\n\n<p>Spark is a fast, general-purpose, scalable distributed computing system designed to improve on the efficiency of Hadoop MapReduce through in-memory computing and high-level abstractions. 
It can be used for large-scale data processing such as batch processing, real-time stream processing, machine learning, and graph processing. Spark includes its own standalone cluster manager, can also run on external cluster managers, and scales to clusters of hundreds of machines.<\/p>\n\n\n\n<p>Installing Spark:<\/p>\n\n\n\n<p>The steps to install Spark are as follows.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Prerequisites: You first need a Java JDK; install it if it is not already present. The prebuilt Spark packages bundle the Scala libraries, so a separate Scala installation is needed only if you compile your own Scala applications. If you plan to write Spark applications in Python, you also need a Python environment.<\/li>\n\n\n\n<li>Download Spark: You can download Spark from the official Spark website. 
Choose the package suited to your operating system; for example, on Linux you can download the .tgz file.<\/li>\n\n\n\n<li>Extract Spark: Extract the downloaded Spark archive into the directory where you want to install it, for example \/home\/user\/spark.<\/li>\n\n\n\n<li>Configure Spark: Spark needs some configuration to work correctly. Open the conf directory under the Spark installation directory, copy the spark-env.sh.template file, and name the copy spark-env.sh. Edit spark-env.sh and set the SPARK_HOME and JAVA_HOME variables. For example:<\/li>\n<\/ol>\n\n\n\n<pre class=\"wp-block-preformatted\"><code># Adjust both paths to match your own system\nexport SPARK_HOME=\/home\/user\/spark\nexport JAVA_HOME=\/usr\/lib\/jvm\/java-1.8.0-openjdk-amd64\n<\/code><\/pre>\n\n\n\n<ol class=\"wp-block-list\" start=\"5\">\n<li>Start Spark: Launch the Spark shell with the following command. It assumes SPARK_HOME is also exported in your current shell, because spark-env.sh is read only by Spark's own scripts.<\/li>\n<\/ol>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>$SPARK_HOME\/bin\/spark-shell\n<\/code><\/pre>\n\n\n\n<p>This starts the Spark shell, where you can run Spark jobs and queries. To run Spark applications on a cluster, you need to configure a Spark cluster and start the Spark master and worker nodes.<\/p>\n\n\n\n<p>Please explain the RDD core concept in detail.<\/p>\n\n\n\n<p>Of course. The RDD is a key Spark concept: it represents an immutable, partitionable collection of data for parallel computation and is Spark's fundamental in-memory data model.<\/p>\n\n\n\n<p>An RDD (Resilient Distributed Dataset) is one of Spark's core concepts and its basic abstraction for distributed data. 
It is a read-only distributed collection that can be split into logical partitions; because the partitions are stored on different nodes of the cluster, they can be processed in parallel. An RDD can be created from files in the Hadoop Distributed File System (HDFS) or from other Spark operations, and it supports a range of operations (transformations and actions) that run in parallel.<\/p>\n\n\n\n<p>RDDs have the following main characteristics.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Immutability: Once created, an RDD cannot be modified. To change the data, you create a new RDD.<\/li>\n\n\n\n<li>Partitioning: An RDD can be split into multiple logical partitions, and each partition can be processed in parallel on a different node in the cluster.<\/li>\n\n\n\n<li>Wide and narrow dependencies: Spark uses dependencies to record the relationships between RDDs, 
and there are two kinds, wide and narrow. A narrow dependency means that each partition of the parent RDD is used by at most one partition of the child RDD. A wide dependency means that a partition of the parent RDD may be used by multiple partitions of the child RDD, which requires a shuffle.<\/li>\n\n\n\n<li>Lazy evaluation: Spark transformations are lazy; they are not executed immediately but wait until an action needs a result.<\/li>\n\n\n\n<li>Caching: Spark supports caching RDDs in memory to speed up access and reuse.<\/li>\n<\/ol>\n\n\n\n<p>Spark supports many operations on RDDs, which fall into two categories: transformations and actions. A transformation turns one RDD into another RDD, while an action evaluates an RDD and returns a result. Commonly used RDD operations are listed below.<\/p>\n\n\n\n<ol 
class=\"wp-block-list\">\n<li>map: Applies a function to each element of the RDD and produces a new RDD.<\/li>\n\n\n\n<li>filter: Returns a new RDD containing the elements that satisfy the given condition.<\/li>\n\n\n\n<li>flatMap: Applies a function that returns a sequence to each element of the RDD and flattens all the sequences into a single RDD.<\/li>\n\n\n\n<li>reduceByKey: Groups the elements of a pair RDD by key, applies a reduce function to the values of each key, and returns a new RDD of (key, value) pairs.<\/li>\n\n\n\n<li>count: Returns the number of elements in the RDD.<\/li>\n\n\n\n<li>collect: Returns all elements of the RDD to the driver.<\/li>\n<\/ol>\n\n\n\n<p>These are only a few of the RDD operations, but they give a sense of what Spark can do.<\/p>\n\n\n\n<p>Please show examples of the Spark map operation in Scala and in Python.<\/p>\n\n\n\n<p>Of course. Here are examples of the Spark map operation in Scala and Python.<\/p>\n\n\n\n<p>Scala example:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>\/\/ sc is the SparkContext that spark-shell provides\nval nums = sc.parallelize(Seq(1, 2, 3, 4, 5))\n\/\/ map is a transformation: it lazily squares each element\nval squared = nums.map(x =&gt; x * x)\n\/\/ collect is an action: it brings the results to the driver for printing\nsquared.collect().foreach(println)\n<\/code><\/pre>\n\n\n\n<p>Explanation: First, the parallelize method creates an RDD containing a sequence of numbers. Next, the map operation squares each element of the RDD and stores the result in a new RDD. Finally, collect brings the elements of the new RDD back to the driver, where foreach prints them.<\/p>\n\n\n\n<p>Python example:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code># sc is the SparkContext that the pyspark shell provides\nnums = sc.parallelize([1, 2, 3, 4, 5])\nsquared = nums.map(lambda x: x * x)  # lazy transformation\n# collect is an action: it returns the results to the driver as a list\nfor value in squared.collect():\n    print(value)\n<\/code><\/pre>\n\n\n\n<p>Explanation: First, the parallelize method creates an RDD containing a sequence of numbers. Next, the map operation squares each element of the RDD and stores the result in a new RDD. Finally, collect returns the elements of the new RDD to the driver, where the loop prints them. 
Note that in Python the function passed to map is defined with a lambda.<\/p>\n\n\n\n<p>Please show examples of the Spark reduce operation in Scala and in Python.<\/p>\n\n\n\n<p>Of course. Here are examples of the Spark reduce operation in Scala and Python.<\/p>\n\n\n\n<p>Scala example:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>val nums = sc.parallelize(Seq(1, 2, 3, 4, 5))\n\/\/ reduce is an action: it combines all elements with the given function\nval sum = nums.reduce((a, b) =&gt; a + b)\nprintln(sum)\n<\/code><\/pre>\n\n\n\n<p>Explanation: First, the parallelize method creates an RDD containing a sequence of numbers. Next, the reduce operation adds up all the elements of the RDD and stores the result in the variable sum. Finally, println prints the value of sum.<\/p>\n\n\n\n<p>Python example:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>nums = sc.parallelize([1, 2, 3, 4, 5])\nsum = nums.reduce(lambda a, b: a + 
b)\nprint(sum)\n<\/code><\/pre>\n\n\n\n<p>Explanation: First, the parallelize method creates an RDD containing a sequence of numbers. Next, the reduce operation adds up all the elements of the RDD and stores the result in the variable sum. Finally, the print function prints the value of sum. Note that in Python the function passed to reduce is defined with a lambda.<\/p>\n\n\n\n<p>Please explain the core concepts of Spark SQL in detail.<\/p>\n\n\n\n<p>Of course. Spark SQL is the Spark component for working with structured data such as tabular data, built on top of Spark's resilient distributed datasets (RDDs). With Spark SQL you can access and manipulate data using SQL queries, the DataFrame API, and the Dataset API. 
Spark SQL integrates with multiple data sources and formats, including Hive, Avro, Parquet, and JSON.<\/p>\n\n\n\n<p>Some of the key concepts in Spark SQL are:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>DataFrame: A DataFrame is one of the central concepts of Spark SQL: a distributed table of data with named columns and a schema that can be inferred. You can think of it as a collection of relational data, similar to a table in a relational database. You access and manipulate a DataFrame through the DataFrame API or Spark SQL statements.<\/li>\n\n\n\n<li>Dataset: A Dataset is an extension of Spark SQL that provides a strongly typed distributed collection of data. Unlike a DataFrame, a Dataset gives you compile-time type safety 
and lets you use an object-oriented programming model.<\/li>\n\n\n\n<li>SQLContext: SQLContext is the entry point to Spark SQL and provides methods for working with DataFrames and Datasets. With it you can read data from various sources into DataFrames or Datasets and run SQL queries. Since Spark 2.0, SparkSession is the unified entry point, and SQLContext is kept for backward compatibility.<\/li>\n\n\n\n<li>Catalyst optimizer: Catalyst is Spark SQL's query optimizer. It applies rules and optimization techniques to improve query performance, automatically optimizing SQL queries and generating efficient execution plans.<\/li>\n\n\n\n<li>Data Source API: Spark SQL provides a Data Source API that lets you integrate different data sources with Spark SQL by writing plugins. 
The Data Source API offers a common interface for reading and writing structured and semi-structured data.<\/li>\n<\/ol>\n\n\n\n<p>Spark SQL supports a variety of data sources, including the Hadoop Distributed File System (HDFS), Apache Hive, Apache Cassandra, Apache HBase, Amazon S3, and JDBC. With Spark SQL you can easily combine different data sources and formats and then access and manipulate the data using SQL or the DataFrame API.<\/p>\n\n\n\n<p>Please explain the Spark core concept DataFrame in detail and give a sample program.<\/p>\n\n\n\n<p>Of course. The DataFrame is a central concept of Spark SQL: a distributed table of data made up of a set of named columns with a schema that can be inferred. 
You can think of a DataFrame as a collection of structured data, similar to a table in a relational database. You access and manipulate DataFrames through the DataFrame API or Spark SQL queries.<\/p>\n\n\n\n<p>Important properties of DataFrames are:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Immutability: Like RDDs, DataFrames are immutable and cannot be modified once created. To change the data, you create a new DataFrame.<\/li>\n\n\n\n<li>Partitionable: A DataFrame can be split into multiple logical partitions, and each partition can be processed in parallel on a different node in the cluster.<\/li>\n\n\n\n<li>Cacheable: Like RDDs, DataFrames 
can be cached in memory to speed up access and reuse.<\/li>\n\n\n\n<li>Lazy evaluation: DataFrame transformations are lazy; they are not executed immediately but wait until a result is needed.<\/li>\n<\/ol>\n\n\n\n<p>The following simple Scala program uses the DataFrame API to read a CSV file and run a few operations on it.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>import org.apache.spark.sql.SparkSession\n\nobject DataFrameExample {\n  def main(args: Array[String]): Unit = {\n    \/\/ Create a local SparkSession that uses all available cores\n    val spark = SparkSession.builder()\n      .appName(\"DataFrameExample\")\n      .master(\"local[*]\")\n      .getOrCreate()\n\n    \/\/ Read the CSV file, treating the first row as a header and inferring column types\n    val df = spark.read\n      .option(\"header\", \"true\")\n      .option(\"inferSchema\", \"true\")\n      .csv(\"path\/to\/file.csv\")\n\n    df.printSchema()\n\n    \/\/ Keep only the rows whose age is greater than 30\n    val filtered = df.filter(\"age &gt; 30\")\n    filtered.show()\n\n    \/\/ Count the rows for each gender\n    val grouped = df.groupBy(\"gender\").count()\n    grouped.show()\n\n    spark.stop()\n  }\n}\n<\/code><\/pre>\n\n\n\n<p>This program uses the SparkSession API to create a SparkSession object, reads data from a CSV file with the read method, and stores the result in a DataFrame. 
It then calls the printSchema method to print the DataFrame's schema. Next, it uses the filter method to keep only the rows whose age is greater than 30 and prints them with the show method. Finally, it uses the groupBy and count methods to group the rows by gender and count the rows in each group, and prints the grouped result with the show method.<\/p>\n\n\n\n<p>In Python, the DataFrame API differs slightly from the Scala API, but the basic concepts are the same.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain Spark's core concept, the Dataset, in detail and give a sample program.<\/p>\n\n\n\n<p>Of course. 
A Dataset is a strongly typed, distributed collection of data and an extension of Spark SQL. Unlike a DataFrame, a Dataset gives you compile-time type safety and lets you work with an object-oriented programming model. You can access and manipulate a Dataset through the Dataset API or with Spark SQL queries.<\/p>\n\n\n\n<p>The key characteristics of a Dataset are:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Strongly typed: unlike a DataFrame, a Dataset is a collection of strongly typed data. Its elements correspond to Scala or Java classes, so types can be checked at compile time.<\/li>\n\n\n\n<li>Cacheable: like a DataFrame, a Dataset can be cached in memory to speed up repeated access and reuse.<\/li>\n\n\n\n<li>Partitioned: 
a Dataset is divided into multiple logical partitions, and each partition can be processed in parallel on a different node of the cluster.<\/li>\n\n\n\n<li>Lazily evaluated: transformations are not executed immediately; Spark defers them until an action requires a result.<\/li>\n<\/ol>\n\n\n\n<p>The following simple Scala program uses the Dataset API to load a CSV file, filter its rows, and count the rows in each group.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>import org.apache.spark.sql.{Encoder, Encoders, SparkSession}\n\n\/\/ Each row of the CSV file is mapped to one Person object\ncase class Person(name: String, age: Int, gender: String)\n\nobject DatasetExample {\n  def main(args: Array[String]): Unit = {\n    val spark = SparkSession.builder()\n      .appName(\"DatasetExample\")\n      .master(\"local[*]\")\n      .getOrCreate()\n\n    \/\/ Encoder that converts between Person objects and Spark's internal row format\n    implicit val encoder: Encoder[Person] = Encoders.product[Person]\n\n    val ds = spark.read\n      .option(\"header\", \"true\")\n      .option(\"inferSchema\", \"true\")\n      .csv(\"path\/to\/file.csv\")\n      .as[Person]\n\n    ds.printSchema()\n\n    \/\/ Typed filter: the lambda is checked against Person at compile time\n    val filtered = ds.filter(p =&gt; p.age &gt; 30)\n    filtered.show()\n\n    val grouped = ds.groupBy(\"gender\").count()\n    grouped.show()\n\n    spark.stop()\n  }\n}\n<\/code><\/pre>\n\n\n\n<p>This program uses the SparkSession API to create a SparkSession 
object and reads the data from a CSV file with the read method. The Person case class, defined at the top of the file, describes the object that each row is mapped to. The as method converts the loaded rows into a Dataset of Person. The program then calls the printSchema method to print the Dataset's schema. Next, it uses the filter method to keep only the people whose age is greater than 30 and prints them with the show method. Finally, it uses the groupBy and count methods to group the Dataset by gender and count the rows in each group, and prints the grouped result with the show method.<\/p>\n\n\n\n<p>The typed Dataset API is available only in Scala and Java; 
Python works with DataFrames instead, though the basic concepts are the same.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain in detail how a DataFrame, Spark's core concept, reads and writes CSV files.<\/p>\n\n\n\n<p>Of course. Spark's DataFrame API provides a set of methods for reading and writing CSV files, so you can easily load CSV data into a DataFrame and save a DataFrame in CSV format.<\/p>\n\n\n\n<p>The following sample program shows how to read and write CSV files with the Spark DataFrame API.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}\n\nobject CsvExample {\n  def main(args: Array[String]): Unit = {\n    val spark = SparkSession.builder()\n      
.appName(\"CsvExample\")\n      .master(\"local[*]\")\n      .getOrCreate()\n\n    \/\/ Read the CSV file into a DataFrame\n    val df: DataFrame = spark.read\n      .format(\"csv\")\n      .option(\"header\", \"true\")\n      .option(\"inferSchema\", \"true\")\n      .load(\"path\/to\/file.csv\")\n\n    \/\/ Display the DataFrame\n    df.show()\n\n    \/\/ Save the DataFrame in CSV format\n    df.write\n      .format(\"csv\")\n      .mode(SaveMode.Overwrite)\n      .option(\"header\", \"true\")\n      .save(\"path\/to\/save\")\n\n    spark.stop()\n  }\n}\n<\/code><\/pre>\n\n\n\n<p>The program above uses the SparkSession API to create a SparkSession object, reads the data from a CSV file with the read method, and stores the result in a DataFrame. It then prints the contents of the DataFrame with the show method and saves the DataFrame in CSV format with the write method. This example uses SaveMode.Overwrite, which replaces any data that already exists at the target path. With SaveMode.Append 
you can instead append the DataFrame to the existing data.<\/p>\n\n\n\n<p>In Python, reading and writing CSV files uses a slightly different API from Scala, but the basic concepts are the same.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain in detail how a DataFrame, Spark's core concept, reads and writes Parquet files.<\/p>\n\n\n\n<p>Of course. Spark's DataFrame API provides a set of methods for reading and writing Parquet files, so you can easily load Parquet data into a DataFrame and save a DataFrame in Parquet format.<\/p>\n\n\n\n<p>The following sample program shows how to use the Spark DataFrame API to read and write Parquet 
files.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}\n\nobject ParquetExample {\n  def main(args: Array[String]): Unit = {\n    val spark = SparkSession.builder()\n      .appName(\"ParquetExample\")\n      .master(\"local[*]\")\n      .getOrCreate()\n\n    \/\/ Read the Parquet file into a DataFrame\n    val df: DataFrame = spark.read\n      .parquet(\"path\/to\/file.parquet\")\n\n    \/\/ Display the DataFrame\n    df.show()\n\n    \/\/ Save the DataFrame in Parquet format\n    df.write\n      .mode(SaveMode.Overwrite)\n      .parquet(\"path\/to\/save\")\n\n    spark.stop()\n  }\n}\n<\/code><\/pre>\n\n\n\n<p>The program above uses the SparkSession API to create a SparkSession object, reads the data from a Parquet file with the read method, and stores the result in a DataFrame. It then prints the contents of the DataFrame with the show method and saves the DataFrame in Parquet format with the write method. This example uses SaveMode.Overwrite, 
which replaces any data that already exists at the target path. With SaveMode.Append you can instead append the DataFrame to the existing data.<\/p>\n\n\n\n<p>In Python, reading and writing Parquet files uses a slightly different API from Scala, but the basic concepts are the same.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain Spark Streaming in detail.<\/p>\n\n\n\n<p>Of course. Spark Streaming is the Spark component for processing streaming data in real time. It lets you process data streams with high-level abstractions such as DStreams 
(discretized streams) and window operations. It divides the live data stream into a series of small batches and processes each batch.<\/p>\n\n\n\n<p>Some of the key concepts of Spark Streaming are:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>DStream: a DStream (discretized stream) is one of the core concepts of Spark Streaming. It represents a continuous stream of data as a sequence of RDDs, each holding the data of one batch.<\/li>\n\n\n\n<li>Input sources: an input source is where Spark Streaming gets its data. Spark Streaming supports multiple input sources, including Kafka, Flume, and TCP sockets.<\/li>\n\n\n\n<li>Transformations: transformations are among the core operations of Spark 
Streaming; they let you transform and process data through the DStream API. Examples include map, filter, and reduce.<\/li>\n\n\n\n<li>Output operations: output operations let you send the processed data to external systems such as the Hadoop Distributed File System (HDFS) or Apache Kafka.<\/li>\n\n\n\n<li>Window operations: window operations let you apply a sliding window over the continuous data stream, so that you can aggregate the most recent data and process the result.<\/li>\n<\/ol>\n\n\n\n<p>Spark Streaming is batch-based: each batch contains a bounded amount of data. Each batch is treated as an RDD and processed by Spark's distributed computing engine. With this approach, Spark Streaming 
can process large volumes of data and deliver fast, efficient real-time analytics.<\/p>\n\n\n\n<p>The following simple Scala program uses Spark Streaming to read text from a TCP socket and count the words in each batch.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>import org.apache.spark.streaming.{Seconds, StreamingContext}\nimport org.apache.spark.SparkConf\n\nobject StreamingExample {\n  def main(args: Array[String]): Unit = {\n    val conf = new SparkConf().setAppName(\"StreamingExample\").setMaster(\"local[*]\")\n    \/\/ Group the incoming data into one-second batches\n    val ssc = new StreamingContext(conf, Seconds(1))\n\n    \/\/ Read lines of text from a TCP socket on localhost:9999\n    val lines = ssc.socketTextStream(\"localhost\", 9999)\n    val words = lines.flatMap(_.split(\" \"))\n    \/\/ Count the occurrences of each word within a batch\n    val wordCounts = words.map((_, 1)).reduceByKey(_ + _)\n    wordCounts.print()\n\n    ssc.start()\n    ssc.awaitTermination()\n  }\n}\n<\/code><\/pre>\n\n\n\n<p>This program uses the SparkConf and StreamingContext APIs to create a StreamingContext object with a one-second batch interval, and reads data from a TCP socket on localhost, port 9999, with the socketTextStream method. It then uses flatMap, map, and reduceByKey to split each line into words and count them. Finally, it prints the result with the print method.<\/p>\n\n\n\n<p>Python 
offers a Spark Streaming API that differs slightly from the Scala API, but the basic concepts are the same.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain Spark MLlib in detail.<\/p>\n\n\n\n<p>Of course. Spark MLlib is Spark's machine learning library, which provides a set of tools and algorithms for large-scale machine learning problems. Because MLlib algorithms run in a distributed environment, they can handle large datasets with ease.<\/p>\n\n\n\n<p>Some of the key concepts of Spark MLlib are:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Data types: Spark MLlib supports a number of data types, including vectors, labels, and samples. 
A vector holds a set of numeric values, a label is the class in a classification problem, and a sample pairs a label with its corresponding feature vector.<\/li>\n\n\n\n<li>Feature extraction: feature extraction is one of the key concepts of Spark MLlib; it derives meaningful features from raw data. Spark MLlib provides a set of feature extraction tools such as TF-IDF and Word2Vec.<\/li>\n\n\n\n<li>Model training: Spark MLlib includes many machine learning algorithms, such as linear regression, logistic regression, decision trees, and support vector machines. These algorithms can be trained in a distributed environment in order to handle large datasets.<\/li>\n\n\n\n<li>Model evaluation: Spark MLlib 
provides a set of evaluation tools for measuring the performance of a trained model. For example, you can evaluate a classifier with tools such as cross-validation and ROC curves.<\/li>\n<\/ol>\n\n\n\n<p>The main goal of Spark MLlib is to provide easy-to-use machine learning tools while supporting large datasets and distributed computing. Because its algorithms and tools are built on core Spark concepts such as RDDs and DataFrames, you can tackle machine learning problems and large datasets in a single, unified programming environment.<\/p>\n\n\n\n<p>The following simple Scala program trains a model on a dataset with Spark MLlib's logistic regression algorithm.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>import org.apache.spark.ml.classification.LogisticRegression\nimport 
org.apache.spark.ml.evaluation.BinaryClassificationEvaluator\nimport org.apache.spark.ml.feature.VectorAssembler\nimport org.apache.spark.sql.SparkSession\n\nobject MLlibExample {\n  def main(args: Array[String]): Unit = {\n    val spark = SparkSession.builder()\n      .appName(\"MLlibExample\")\n      .master(\"local[*]\")\n      .getOrCreate()\n\n    \/\/ Load the dataset\n    val data = spark.read\n      .option(\"header\", \"true\")\n      .option(\"inferSchema\", \"true\")\n      .csv(\"path\/to\/data.csv\")\n\n    \/\/ Assemble the feature columns into a single vector column\n    val assembler = new VectorAssembler()\n      .setInputCols(Array(\"col1\", \"col2\", \"col3\"))\n      .setOutputCol(\"features\")\n\n    val df = assembler.transform(data).select(\"features\", \"label\")\n\n    \/\/ Split the data into training and test sets\n    val Array(trainingData, testData) = df.randomSplit(Array(0.7, 0.3), seed = 1234)\n\n    \/\/ Train the logistic regression model\n    \/\/ (assumed minimal completion with default settings; the listing breaks off at this point)\n    val model = new LogisticRegression().fit(trainingData)\n\n    \/\/ Evaluate on the test set; the evaluator's default metric is the area under the ROC curve\n    val predictions = model.transform(testData)\n    val auc = new BinaryClassificationEvaluator().evaluate(predictions)\n    println(s\"Area under ROC = $auc\")\n\n    spark.stop()\n  }\n}\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please tell me about Spark GraphX in detail.<\/p>\n\n\n\n<p>Of course. Spark GraphX is Spark's graph processing library, which provides a set of tools and algorithms for working with large graph datasets. Spark GraphX 
\u306e\u30a2\u30eb\u30b4\u30ea\u30ba\u30e0\u306f\u5206\u6563\u74b0\u5883\u3067\u5b9f\u884c\u3067\u304d\u308b\u305f\u3081\u3001\u5927\u898f\u6a21\u306a\u30b0\u30e9\u30d5 \u30c7\u30fc\u30bf\u30bb\u30c3\u30c8\u3092\u7c21\u5358\u306b\u64cd\u4f5c\u3067\u304d\u307e\u3059\u3002<\/p>\n\n\n\n<p>Spark GraphX \u306e\u4e3b\u8981\u306a\u6982\u5ff5\u306e\u4e00\u90e8\u3092\u6b21\u306b\u793a\u3057\u307e\u3059\u3002<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\u30b0\u30e9\u30d5: \u30b0\u30e9\u30d5\u306f Spark GraphX \u306e\u4e2d\u6838\u3068\u306a\u308b\u6982\u5ff5\u306e 1 \u3064\u3067\u3042\u308a\u3001\u9802\u70b9\u306e\u30bb\u30c3\u30c8\u3068\u30a8\u30c3\u30b8\u306e\u30bb\u30c3\u30c8\u3067\u69cb\u6210\u3055\u308c\u307e\u3059\u3002 \u5404\u9802\u70b9\u306b\u306f\u4e00\u610f\u306e\u8b58\u5225\u5b50\u304c\u3042\u308a\u3001\u5404\u30a8\u30c3\u30b8\u306f 2 \u3064\u306e\u9802\u70b9\u3092\u63a5\u7d9a\u3057\u307e\u3059\u3002<\/li>\n\n\n\n<li>\u30d7\u30ed\u30d1\u30c6\u30a3 \u30b0\u30e9\u30d5: \u30d7\u30ed\u30d1\u30c6\u30a3 \u30b0\u30e9\u30d5\u306f\u3001\u5404\u9802\u70b9\u3068\u5404\u30a8\u30c3\u30b8\u304c\u5c5e\u6027\u5024\u3092\u6301\u3064\u3053\u3068\u304c\u3067\u304d\u308b\u62e1\u5f35\u30b0\u30e9\u30d5\u3067\u3059\u3002 \u3053\u308c\u306b\u3088\u308a\u3001\u30e6\u30fc\u30b6\u30fc\u3068\u88fd\u54c1\u306e\u95a2\u4fc2\u306a\u3069\u3001\u3088\u308a\u591a\u304f\u306e\u60c5\u5831\u3092\u30b0\u30e9\u30d5\u306b\u4fdd\u5b58\u3067\u304d\u307e\u3059\u3002<\/li>\n\n\n\n<li>\u5909\u63db\u64cd\u4f5c: \u5909\u63db\u64cd\u4f5c\u306f Spark GraphX \u306e\u30b3\u30a2\u64cd\u4f5c\u306e 1 \u3064\u3067\u3042\u308a\u3001GraphX API \u3092\u4f7f\u7528\u3057\u3066\u30b0\u30e9\u30d5\u3092\u5909\u63db\u304a\u3088\u3073\u51e6\u7406\u3067\u304d\u307e\u3059\u3002 \u305f\u3068\u3048\u3070\u3001\u30de\u30c3\u30d7\u9802\u70b9\u3001\u30de\u30c3\u30d7\u30a8\u30c3\u30b8\u3001\u30b5\u30d6\u30b0\u30e9\u30d5\u306a\u3069\u306e\u64cd\u4f5c\u306a\u3069\u3067\u3059\u3002<\/li>\n\n\n\n<li>\u8a08\u7b97\u64cd\u4f5c: 
Computation operations are another core kind of operation in Spark GraphX; they run algorithms over a graph, such as PageRank, graph clustering, and connectivity analysis (for example, connected components).<\/li>\n\n\n\n<li>Visualization: GraphX itself does not include visualization tools. To visualize graph data, the vertices and edges are typically exported to an external tool.<\/li>\n<\/ol>\n\n\n\n<p>The main goal of Spark GraphX is to provide easy-to-use graph processing while supporting large graph datasets and distributed computation. GraphX is built on Spark's RDD abstraction, so graph problems and other large-scale data processing can be handled in one programming environment.<\/p>\n\n\n\n<p>The following computes the PageRank value of each vertex in a graph with Spark GraphX, as a simple Scala
program:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>import org.apache.spark.graphx.GraphLoader\nimport org.apache.spark.sql.SparkSession\n\nobject GraphXExample {\n  def main(args: Array[String]): Unit = {\n    val spark = SparkSession.builder()\n      .appName(\"GraphXExample\")\n      .master(\"local[*]\")\n      .getOrCreate()\n\n    \/\/ Load the graph from an edge list file\n    val graph = GraphLoader.edgeListFile(spark.sparkContext, \"path\/to\/graph.txt\")\n\n    \/\/ Compute PageRank, iterating until the ranks converge within the given tolerance\n    val ranks = graph.pageRank(0.0001).vertices\n\n    \/\/ Print the 10 vertices with the highest PageRank values\n    ranks.sortBy(_._2, ascending = false).take(10).foreach(println)\n\n    spark.stop()\n  }\n}\n<\/code><\/pre>\n\n\n\n<p>This program uses the GraphLoader API to read graph data from an edge list file. It then calls the pageRank method to compute the PageRank value of each vertex and stores the result in the ranks variable. Finally, it sorts the values in descending order with sortBy and prints the top 10 with take.<\/p>\n\n\n\n<p>GraphX has no Python API;
from Python, graph processing on Spark is usually done with the separate GraphFrames package, which offers similar concepts on top of DataFrames.<\/p>\n\n\n\n<p>Please explain Scala as a Spark programming language in detail.<\/p>\n\n\n\n<p>Certainly. Scala is a high-level programming language and one of the languages Spark supports; Spark itself is written in Scala. Scala's syntax is concise, and the language combines object-oriented and functional programming. Scala runs on the JVM and interoperates closely with Java, so Java libraries and frameworks can be used directly.<\/p>\n\n\n\n<p>Some core concepts of the Scala language are:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Classes and objects: Scala is an object-oriented language that supports defining classes and objects.
A class describes the fields and methods shared by its instances, and each instance of a class is an object.<\/li>\n\n\n\n<li>Functions and closures: Scala supports functional programming features such as functions and closures. A function is a group of statements that performs a specific task; a closure is a function together with the variables it captures from its enclosing scope.<\/li>\n\n\n\n<li>Pattern matching: Scala supports pattern matching, which inspects the type and structure of a value and runs different code for each case.<\/li>\n\n\n\n<li>Higher-order functions: Scala supports higher-order functions, so a function can be passed to another function as an argument or returned from a function as a result.<\/li>\n\n\n\n<li>Immutability: Scala
supports immutability and encourages the use of immutable values and collections, which helps avoid concurrency problems and data races.<\/li>\n<\/ol>\n\n\n\n<p>Scala is a capable language for working with large datasets and distributed computation. Spark's Scala API builds on core Scala concepts such as classes, objects, and functions to provide an easy-to-use programming interface with rich functionality.<\/p>\n\n\n\n<p>The following simple Scala program uses the Spark SQL API to read data from a CSV file and count how many times each word occurs. It assumes the file has a header row and a column named text.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>import org.apache.spark.sql.SparkSession\n\nobject ScalaExample {\n  def main(args: Array[String]): Unit = {\n    val spark = SparkSession.builder()\n      .appName(\"ScalaExample\")\n      .master(\"local[*]\")\n      .getOrCreate()\n\n    \/\/ Read the CSV file into a DataFrame\n    val df = spark.read\n      .option(\"header\", \"true\")\n      .csv(\"path\/to\/file.csv\")\n\n    \/\/ Count the occurrences of each word in the text column\n    val wordCounts = df.select(\"text\")\n     
.rdd.flatMap(_.getString(0).split(\" \"))\n      .map(word =&gt; (word, 1))\n      .reduceByKey(_ + _)\n\n    \/\/ Print the results (in local mode the output appears in this console)\n    wordCounts.foreach(println)\n\n    spark.stop()\n  }\n}\n<\/code><\/pre>\n\n\n\n<p>This program creates a SparkSession object through the SparkSession builder, reads the CSV file with the read method, and stores the result in a DataFrame. It then uses flatMap and map to split the text into words and pair each word with the count 1. Next, reduceByKey sums the counts for each word, and the result is stored in the wordCounts variable. Finally, foreach prints the results.<\/p>\n\n\n\n<p>Scala's concise syntax, and the fact that Spark itself is written in Scala, make it well suited to programming with Spark.<\/p>\n\n\n\n<p>Please explain Java as a Spark programming language in detail.<\/p>\n\n\n\n<p>Certainly.
Java is an object-oriented programming language and another of the main languages Spark supports. Java's syntax is more verbose than Scala's, but the language offers strong static type checking and a large ecosystem of libraries and frameworks.<\/p>\n\n\n\n<p>Some of the main concepts of the Java language are:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Classes and objects: Java is an object-oriented language that supports defining classes and objects. A class describes the fields and methods shared by its instances, and each instance of a class is an object.<\/li>\n\n\n\n<li>Interfaces: Java
supports defining interfaces, which declare a set of methods that implementing classes must provide.<\/li>\n\n\n\n<li>Packages: Java supports packages, which group related classes and interfaces and provide namespace management.<\/li>\n\n\n\n<li>Exception handling: Java supports exception handling, so a program can deal with errors and exceptional conditions at run time.<\/li>\n\n\n\n<li>Multithreading: Java supports multithreaded programming, so several threads can run concurrently to improve a program's performance.<\/li>\n<\/ol>\n\n\n\n<p>Java is a general-purpose programming language that can be used to process large datasets and for distributed computation. Spark's Java API uses classes, objects, interfaces, and other core Java
concepts to provide an easy-to-use programming interface with rich functionality.<\/p>\n\n\n\n<p>The following simple Java program uses the Spark SQL API to read data from a CSV file and count how many times each word occurs. It assumes the file has a header row and a column named text.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>import java.util.Arrays;\n\nimport org.apache.spark.api.java.function.FlatMapFunction;\nimport org.apache.spark.sql.Dataset;\nimport org.apache.spark.sql.Encoders;\nimport org.apache.spark.sql.Row;\nimport org.apache.spark.sql.SparkSession;\n\npublic class JavaExample {\n    public static void main(String[] args) {\n        SparkSession spark = SparkSession.builder()\n                .appName(\"JavaExample\")\n                .master(\"local[*]\")\n                .getOrCreate();\n\n        \/\/ Read the CSV file into a Dataset of rows\n        Dataset&lt;Row&gt; df = spark.read()\n                .option(\"header\", \"true\")\n                .csv(\"path\/to\/file.csv\");\n\n        \/\/ Split the text into words; the cast selects the Java overload of flatMap\n        Dataset&lt;Row&gt; wordCounts = df.select(\"text\")\n                .flatMap((FlatMapFunction&lt;Row, String&gt;) row -&gt;\n                        Arrays.asList(row.getString(0).split(\" \")).iterator(), Encoders.STRING())\n                \/\/ Count the occurrences of each word (the string column is named \"value\")\n                .groupBy(\"value\")\n                .count();\n\n        \/\/ Show the result\n        wordCounts.show();\n\n        spark.stop();\n    }\n}\n<\/code><\/pre>\n\n\n\n<p>This program creates a SparkSession object through the SparkSession builder, then uses the read method to load the CSV
file and stores the result in a Dataset. Next, it uses flatMap to split the text into words, then groupBy and count to count the occurrences of each word. Finally, it prints the result with the show method.<\/p>\n\n\n\n<p>With strong static type checking and a large ecosystem of libraries and frameworks, Java is a solid choice for programming with Spark.<\/p>\n\n\n\n<p>Please explain Python as a Spark programming language in detail.<\/p>\n\n\n\n<p>Certainly. Python is a high-level programming language that is easy to learn, and another of the main languages Spark supports.
Python's syntax is concise and easy to read and write, and the language has a rich ecosystem of libraries and frameworks.<\/p>\n\n\n\n<p>Some of the main concepts of the Python language are:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Variables and data types: Python supports many data types, including strings, numbers, and lists. Variables hold values of these types.<\/li>\n\n\n\n<li>Functions and modules: Python supports defining and calling functions, which are groups of statements that perform a specific task.
Python also supports importing and using modules, which are collections of related functions and variables.<\/li>\n\n\n\n<li>Conditional and loop statements: Python supports conditional statements and loops, which run different code depending on a condition and repeat a block of code.<\/li>\n\n\n\n<li>Exception handling: Python supports exception handling, so a program can deal with errors and exceptional conditions at run time.<\/li>\n\n\n\n<li>List comprehensions and generators: Python supports list comprehensions and generators, which build lists and lazy sequences concisely; generators produce values one at a time, which helps when iterating over large datasets.<\/li>\n<\/ol>\n\n\n\n<p>Python is a general-purpose programming language that can be used to process large datasets and for distributed computation.
Spark's Python API (PySpark) builds on core Python concepts such as variables, functions, and modules to provide an easy-to-use programming interface with rich functionality.<\/p>\n\n\n\n<p>The following simple Python program uses the Spark SQL API to read data from a CSV file and count how many times each word occurs. It assumes the file has a header row and a column named text.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>from pyspark.sql import SparkSession\n\nspark = SparkSession.builder \\\n    .appName(\"PythonExample\") \\\n    .master(\"local[*]\") \\\n    .getOrCreate()\n\n# Read the CSV file into a DataFrame\ndf = spark.read \\\n    .option(\"header\", \"true\") \\\n    .csv(\"path\/to\/file.csv\")\n\n# Count the occurrences of each word in the text column\nword_counts = df.select(\"text\") \\\n    .rdd.flatMap(lambda row: row[0].split(\" \")) \\\n    .map(lambda word: (word, 1)) \\\n    .reduceByKey(lambda a, b: a + b) \\\n    .toDF([\"word\", \"count\"])\n\n# Show the result\nword_counts.show()\n\nspark.stop()\n<\/code><\/pre>\n\n\n\n<p>This program creates a SparkSession object through the SparkSession builder, reads the CSV file with the read method, and stores the result in a DataFrame. Next, it uses flatMap and map
to split the text into words and pair each word with the count 1. It then sums the counts for each word with reduceByKey, converts the result back to a DataFrame with toDF, and stores it in the word_counts variable. Finally, it prints the result with the show method.<\/p>\n\n\n\n<p>Python's concise syntax and rich ecosystem of libraries and frameworks make it well suited to programming with Spark.<\/p>\n\n\n\n<p>Please explain Spark Standalone, one of the Spark cluster architectures, in detail.<\/p>\n\n\n\n<p>Certainly. Spark Standalone is the cluster manager built into Spark. It runs Spark applications on a dedicated cluster without requiring an external resource manager. Spark Standalone
provides simple but capable cluster management, including resource management, task scheduling, and monitoring.<\/p>\n\n\n\n<p>The Spark Standalone cluster architecture consists of the following parts:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Driver program: The driver program is the entry point of a Spark application. It defines the application's computation and sends tasks to executors on the cluster's worker nodes for execution.<\/li>\n\n\n\n<li>Cluster manager: The cluster manager is responsible for allocating cluster resources to Spark applications and for starting and stopping their executor processes on the worker nodes. The Spark Standalone cluster manager uses two kinds of process: a Master, which tracks and allocates cluster
resources, and Workers, which run on the cluster nodes and launch the executors that run tasks and return results.<\/li>\n\n\n\n<li>Cluster resource management: Cluster resource management tracks the resources available in the cluster, such as CPU cores and memory, and assigns them to applications. Spark Standalone supports two allocation modes: static allocation and dynamic allocation. With static allocation, an application's resources are fixed when it starts; with dynamic allocation, executors are added and removed while the application runs, according to its workload.<\/li>\n\n\n\n<li>Task scheduling: Task scheduling distributes the tasks of a Spark application to the worker nodes of the cluster for execution. In a Spark Standalone
cluster, two scheduling modes are available: FIFO and FAIR. FIFO scheduling runs jobs in the order in which they were submitted. FAIR scheduling shares resources among concurrently running jobs, optionally weighted by pool, so that short jobs do not have to wait behind long ones.<\/li>\n\n\n\n<li>Monitoring and logging: Monitoring and logging track the state of the cluster and the performance of applications, and record application logs and error information. A Spark Standalone cluster provides monitoring and logging through its web UI and command-line tools.<\/li>\n<\/ol>\n\n\n\n<p>Overall, the Spark Standalone cluster
architecture provides simple but capable cluster management for processing large datasets with distributed computation. Its cluster manager is straightforward to operate, and features such as task scheduling and monitoring help applications run efficiently and reliably.<\/p>\n\n\n\n<p>Please explain Hadoop YARN, another Spark cluster architecture, in detail.<\/p>\n\n\n\n<p>Certainly. Hadoop YARN is a distributed resource management system and one of the core components of the Hadoop ecosystem.
It allocates resources and manages jobs across the whole cluster, and it supports multiple application frameworks, including Spark.<\/p>\n\n\n\n<p>The Hadoop YARN cluster architecture is described in detail below.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>ResourceManager: The ResourceManager is the central service of a YARN cluster. It tracks the resource state of the whole cluster and allocates resources to individual applications.<\/li>\n\n\n\n<li>NodeManager: The NodeManager is the YARN agent on each worker node and manages local resources such as CPU, memory, and disk. The NodeManager
also starts and stops containers, in which an application's processes and their dependencies run.<\/li>\n\n\n\n<li>ApplicationMaster: The ApplicationMaster is a per-application process, not a separate node, that negotiates resources for its application and coordinates the application's execution. It requests resources from the ResourceManager and works with the NodeManagers to start and stop containers.<\/li>\n\n\n\n<li>Container: A container is an allocation of resources, such as memory and CPU cores, on a single node, in which an application's processes and their dependencies run. YARN uses containers to run applications and manage resources. Containers are started by the NodeManager and run under the direction of the
ApplicationMaster.<\/li>\n\n\n\n<li>Container scheduling: Container scheduling is the process of assigning containers to worker nodes and their resources. The YARN scheduler decides which containers run on which worker nodes, and the NodeManagers start and stop them.<\/li>\n<\/ol>\n\n\n\n<p>Overall, the Hadoop YARN cluster architecture provides a powerful and flexible resource management system for processing large datasets with distributed computing. The ResourceManager and NodeManagers handle resource management, while the ApplicationMaster and container scheduling help applications run with good performance and reliability. Spark runs on a YARN cluster through its YARN client support, which lets it use the resource management and scheduling capabilities of the Hadoop
ecosystem.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain Apache Mesos as a Spark cluster architecture in detail<\/p>\n\n\n\n<p>Of course. Apache Mesos is a general-purpose cluster management system that can manage many kinds of applications, including Spark. By pooling resources and allocating them dynamically, Mesos improves cluster resource utilization and performance, and it allows multiple application frameworks to coexist and share the same cluster.<\/p>\n\n\n\n<p>The Apache Mesos cluster
architecture is described in detail below.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Mesos master: The Mesos master is the central node of a Mesos cluster and is responsible for managing and allocating cluster resources. It tracks the resource state of the whole cluster and offers resources to the frameworks running on it.<\/li>\n\n\n\n<li>Mesos agent: A Mesos agent runs on each worker node and manages local resources such as CPU, memory, and disk. It starts and stops tasks, and it communicates with the Mesos master to receive task assignments and report status updates.<\/li>\n\n\n\n<li>Framework: A framework is an application that runs on Mesos, such as Spark. 
A framework receives resource offers from the Mesos master and decides which tasks to launch on which Mesos agents. It also coordinates task execution and monitors the status of the application.<\/li>\n\n\n\n<li>Task: A task is the unit of execution of an application running on Mesos. Each task is assigned by its framework to a Mesos agent for execution and can contain one or more processes.<\/li>\n\n\n\n<li>Resource scheduling: Resource scheduling is the process of assigning tasks to available Mesos agents. 
Mesos decides how many resources to offer to each framework, and each framework decides which tasks to run on the resources it is offered.<\/li>\n<\/ol>\n\n\n\n<p>Overall, the Apache Mesos cluster architecture provides a general-purpose, flexible cluster management system for processing large datasets with distributed computing. The Mesos master and Mesos agents handle resource management, while frameworks and resource scheduling help applications run with good performance and reliability. Spark runs on a Mesos cluster through its Mesos client support and uses the resource management and scheduling capabilities of Mesos.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain the lifecycle of a Spark application in detail<\/p>\n\n\n\n<p>Of course. The lifecycle of a Spark application consists of four phases: writing, packaging, submission, and execution.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Writing: In the writing phase, you write the application code with the Spark API and define the data processing logic and the flow of operations. The Spark API supports several programming languages, including Scala, Java, and Python.<\/li>\n\n\n\n<li>Packaging: In the packaging phase, you package the application code so that it can run on a Spark cluster: as a JAR file for Scala or Java, or as .py, .zip, or .egg files for Python. Packaging bundles the application
code with its libraries and other dependencies, which makes deployment and execution easier.<\/li>\n\n\n\n<li>Submission: In the submission phase, you use the spark-submit script to send the packaged application to the Spark cluster. Submission passes the application code, configuration, and command-line arguments to the cluster and starts the application.<\/li>\n\n\n\n<li>Execution: In the execution phase, Spark starts the application, allocates resources based on the application code and configuration, and runs tasks on the cluster. Spark includes libraries such as Spark SQL, Spark Streaming, MLlib, and GraphX,
all of which run on the same core execution engine.<\/li>\n<\/ol>\n\n\n\n<p>In summary, the lifecycle of a Spark application consists of four phases: writing, packaging, submission, and execution. Each phase has its own tasks and steps, which need to be planned and carried out carefully. Understanding this lifecycle is key to understanding Spark applications and helps you optimize their performance and reliability.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain in detail how to configure a Spark application<\/p>\n\n\n\n<p>Of course. Configuring a Spark application is an important step in optimizing its performance and reliability. Configuration can be done at several levels: Spark cluster configuration, application configuration, and runtime configuration.<\/p>\n\n\n\n<p>These levels break down as follows.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Spark cluster configuration: Cluster configuration covers the parameters that set up the Spark cluster itself, such as the cluster manager, resource manager, scheduler, and monitoring tools. These parameters can be set in the cluster's configuration files, such as spark-env.sh and spark-defaults.conf.
Cluster configuration settings are shared by all applications that run on the cluster.<\/li>\n\n\n\n<li>Application configuration: Application configuration covers the parameters of a single Spark application, such as data input and output paths, memory and CPU allocation, the task scheduling policy, and monitoring and logging. These parameters can be set in the application code, for example with a SparkConf object or from command-line arguments that the code reads. Application configuration settings apply only to runs of that application.<\/li>\n\n\n\n<li>Runtime configuration: Runtime configuration covers the parameters supplied when the application is launched, such as the options passed to spark-shell or to the spark-submit command.
They set up the application's runtime environment: the master URL, executor memory and CPU cores, driver memory and CPU cores, and so on. When the same property is set in more than one place, values set directly on SparkConf in application code take the highest precedence, followed by flags passed to spark-submit or spark-shell, and then entries in spark-defaults.conf.<\/li>\n<\/ol>\n\n\n\n<p>There are several aspects to consider when configuring a Spark application.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Cluster hardware and resources: The hardware and resources of the Spark cluster affect application performance and reliability. 
Allocate cluster resources such as CPU, memory, and disk according to the application's resource requirements and compute load, and set the task scheduling policy and resource allocation mode to match.<\/li>\n\n\n\n<li>Application code and data: The design and organization of application code and data affect application performance and reliability. Optimize the code structure and the implementation of algorithms to reduce data transfer and processing time. Also choose input and output paths and formats that allow data to be read and written efficiently.<\/li>\n\n\n\n<li>Application configuration and parameters:
Application configuration and parameters affect application performance and reliability. Parameters such as memory and CPU allocation, the task scheduling policy, and monitoring and logging should be configured according to the needs of the application and the limits of the hardware.<\/li>\n<\/ol>\n\n\n\n<p>Overall, configuring a Spark application is an important step in optimizing its performance and reliability. For the best results, choose settings that take into account the cluster hardware and resources, the application code and data, and the application configuration and parameters.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain in detail how to deploy and run a Spark application<\/p>\n\n\n\n<p>Of course. Deploying and running a Spark application involves several steps: preparing the environment, packaging the application, submitting the application, and monitoring and debugging the application.<\/p>\n\n\n\n<p>These steps are described in detail below.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Prepare the environment: In this phase, you prepare the runtime environment of the Spark cluster: installing software such as Java and Spark, setting environment variables and configuration files, creating the cluster, and allocating resources. 
You should also check the cluster hardware and network connectivity and confirm that they meet the requirements of the application.<\/li>\n\n\n\n<li>Package the application: In this phase, you package the application code as a JAR file for Scala or Java, or as .py, .zip, or .egg files for Python. Packaging bundles the application code with its libraries and other dependencies, which makes deployment and execution easier.<\/li>\n\n\n\n<li>Submit the application: In this phase, you use the spark-submit command-line tool, or another submission script, to send the packaged application to the Spark cluster. Submission
passes the application code, configuration, and command-line arguments to the cluster and starts the application. When you submit the application, you specify its main class, its resource requirements, the master URL it runs against, and any other parameters.<\/li>\n\n\n\n<li>Monitor and debug the application: While the application is running, monitor and debug it to make sure that it runs correctly and performs well. Spark provides several monitoring and debugging tools, including the Spark web UI, the Spark history server, and the Spark logs. 
These tools help you understand the application's state and performance and diagnose and resolve problems.<\/li>\n<\/ol>\n\n\n\n<p>Overall, deploying and running a Spark application involves several steps that each need careful planning and execution: preparing the environment, packaging the application, submitting it, and monitoring and debugging it. Understanding this deployment and operations process is key to understanding Spark applications, and it helps you optimize their performance and reliability.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun 
tei\"\/><\/figure>\n\n\n\n<p>Please explain in detail how to optimize Spark performance on the hardware side.<\/p>\n\n\n\n<p>Certainly. Tuning the hardware can improve both the performance and the reliability of Spark applications. The following are recommendations for optimizing Spark performance from a hardware perspective.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Memory: Spark applications need a large amount of memory to cache data and run computation tasks, so memory is one of the key factors in Spark performance. Where possible, use machines with 64 GB of memory or more so that more data can be cached and larger partitions can be handled. 
At the same time, divide that memory sensibly among the driver, the executors, and the cache to get good performance and stability.<\/li>\n\n\n\n<li>CPU: The CPU is another key factor in running Spark applications. Where possible, use multi-core CPUs with 8 or more cores to support more concurrent tasks and higher compute throughput. At the same time, allocate CPU resources sensibly so that executors and tasks have enough cores to do their work.<\/li>\n\n\n\n<li>Storage: Storage is another hardware component that Spark applications use heavily. 
Use fast storage devices such as SSDs or NVMe drives to improve read and write throughput and latency. At the same time, allocate storage resources sensibly so that the application can read and write data efficiently.<\/li>\n\n\n\n<li>Network: The network is the main path for moving data within a Spark cluster. Use high-speed interconnects such as Gigabit (or faster) Ethernet or InfiniBand to improve data transfer speed and reliability, and provision enough network capacity that data can move quickly within the cluster.<\/li>\n\n\n\n<li>Cluster size: Cluster size is another key factor in Spark performance. 
Size the cluster according to the application's compute load and data volume. A cluster that is too small leaves the application short of resources and degrades performance. A cluster that is too large wastes resources and makes scheduling more complex.<\/li>\n<\/ol>\n\n\n\n<p>Overall, hardware-side tuning can improve the performance and reliability of an application. The components to consider are memory, CPU, storage, network, and cluster size.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" 
src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain in detail how to optimize Spark performance on the software side.<\/p>\n\n\n\n<p>Certainly. Software-side tuning can also improve the performance and reliability of Spark applications. The following are recommendations for optimizing Spark performance from a software perspective.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Data locality: Data locality is one of the most important strategies for Spark performance. Wherever possible, keep data on the same node as the tasks that process it, to avoid transfer latency and network bottlenecks. Spark 
lets you control partitioning with the repartition and coalesce methods, and cache data with the cache and persist methods so that later tasks can reuse it.<\/li>\n\n\n\n<li>Parallelism: Parallelism is the degree to which computation tasks in a Spark application run concurrently. Increase it as far as the cluster's cores allow to improve throughput and response time. You can adjust the degree of parallelism through the number of partitions, the number of executors, and the cores and memory given to each executor.<\/li>\n\n\n\n<li>Data compression: Compression reduces data transfer and storage overhead and can improve Spark 
application performance. You can compress data with codecs such as Gzip, Snappy, or LZ4 so that it is stored and transferred more efficiently. In Spark, the spark.sql.inMemoryColumnarStorage.compressed and spark.rdd.compress parameters enable compression of cached columnar data and of serialized RDD partitions, respectively.<\/li>\n\n\n\n<li>Serialization: Serialization is a key factor in data transfer and storage in Spark applications. To speed up data transfer, choose an efficient serializer: Kryo is usually faster and more compact than the default Java serialization. In Spark, the serializer is set with the spark.serializer parameter.<\/li>\n\n\n\n<li>Resource management: Resource management governs how a Spark 
application is allocated resources and scheduled. Choose a suitable cluster manager, such as Spark standalone, YARN, or Mesos, and allocate and manage resources according to the application's needs and the limits of the hardware. In Spark, parameters such as spark.executor.instances, spark.executor.memory, and spark.executor.cores configure resource allocation.<\/li>\n<\/ol>\n\n\n\n<p>Overall, software-side tuning can improve the performance and reliability of an application. The areas to consider are data locality, parallelism, data compression, serialization, and resource 
management.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please explain in detail how to tune the configuration of a Spark application.<\/p>\n\n\n\n<p>Certainly. Tuning a Spark application's configuration can improve its performance and reliability. The following are some suggestions.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Tune the memory configuration: Memory is one of the most important resources for a Spark application. Adjust the memory configuration according to the application's memory 
needs and the amount of memory available. In Spark, the spark.driver.memory and spark.executor.memory parameters set the memory of the driver and of each executor. In addition, spark.memory.fraction sets the share of the heap used for execution and storage, and spark.memory.storageFraction sets the part of that share protected for storage; off-heap memory is configured separately with spark.memory.offHeap.enabled and spark.memory.offHeap.size.<\/li>\n\n\n\n<li>Tune the parallelism configuration: Parallelism is the degree to which tasks in a Spark application run in parallel. Adjust it according to the application's compute load and data volume. In Spark, parameters such as spark.default.parallelism and spark.sql.shuffle.partitions control how many partitions, and therefore how many parallel tasks, are used. 
In addition, the repartition and coalesce methods can be used to adjust data partitioning for different kinds of workloads.<\/li>\n\n\n\n<li>Tune the scheduling configuration: Scheduling determines how tasks are scheduled and how resources are allocated in a Spark application. Adjust it according to the application's needs and the limits of the hardware. In Spark, the spark.scheduler.mode parameter selects the scheduling mode (FIFO or FAIR), and parameters such as spark.executor.instances, spark.executor.cores, and spark.executor.memory set the number of executors 
and the cores and memory of each.<\/li>\n\n\n\n<li>Tune the I\/O configuration: I\/O is a key factor in reading, writing, and transferring data in a Spark application. Adjust the I\/O configuration according to the application's data volume and data distribution. In Spark, parameters such as spark.shuffle.service.enabled, spark.shuffle.file.buffer, and spark.reducer.maxSizeInFlight control shuffle reads, writes, and transfers.<\/li>\n\n\n\n<li>Tune the query optimization configuration: These settings affect the execution plan and compute throughput. Adjust them according to the application's compute load and data volume. In Spark, parameters such as spark.sql.autoBroadcastJoinThreshold and spark.sql.shuffle.partitions influence the execution plan and throughput.<\/li>\n<\/ol>\n\n\n\n<p>Overall, tuning a Spark application's configuration can improve its performance and reliability. Tune the memory, parallelism, scheduling, I\/O, and query optimization settings to suit the workload and the limits of the hardware.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" 
alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please give me a sample program for data processing and cleaning with Spark.<\/p>\n\n\n\n<p>Certainly. The following is an example program that uses Spark for data processing and cleaning.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>from pyspark.sql import SparkSession\nfrom pyspark.sql import functions as F\n\n# Create the SparkSession\nspark = SparkSession.builder.appName(\"DataCleaningExample\").getOrCreate()\n\n# Read the input file\ndf = spark.read.csv(\"path\/to\/input\/data.csv\", header=True, inferSchema=True)\n\n# Preview the data\ndf.show()\n\n# Drop rows that contain nulls, then drop duplicate rows\ndf = df.dropna().dropDuplicates()\n\n# Rename a column\ndf = df.withColumnRenamed(\"old_col_name\", \"new_col_name\")\n\n# Keep only rows where col_name is positive\ndf = df.filter(F.col(\"col_name\") &gt; 0)\n\n# Add a new column\ndf = df.withColumn(\"new_col\", F.col(\"col_name\") + 1)\n\n# Group by a column and aggregate\ngrouped_df = df.groupBy(\"col_name\").agg(F.avg(\"col1\"), F.sum(\"col2\"))\ngrouped_df.show()\n\n# Save the cleaned data (Spark writes a directory of part files at this path)\ndf.write.csv(\"path\/to\/output\/data.csv\", header=True)\n\n# Stop the SparkSession\nspark.stop()\n<\/code><\/pre>\n\n\n\n<p>The sample program above uses Spark to read a CSV 
file and then cleans and processes the data. First, <code>dropna()<\/code> and <code>dropDuplicates()<\/code> remove rows with null values and duplicate rows. Next, <code>withColumnRenamed()<\/code> renames a column, <code>filter()<\/code> filters the rows, <code>withColumn()<\/code> adds a new column, and <code>groupBy()<\/code> with <code>agg()<\/code> groups the data by a column and aggregates it. Finally, <code>write.csv()<\/code> saves the cleaned data to the output path.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun 
tei\"\/><\/figure>\n\n\n\n<p>Please give me an example program for real-time data analysis with Spark.<\/p>\n\n\n\n<p>Certainly. The following is an example program for real-time data analysis with Spark.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>from pyspark.sql import SparkSession\nfrom pyspark.sql import functions as F\nfrom pyspark.sql.types import StructType, StructField, TimestampType, IntegerType\n\n# Create the SparkSession\nspark = SparkSession.builder.appName(\"RealTimeAnalysisExample\").getOrCreate()\n\n# Define the schema of the incoming data\nschema = StructType([\n    StructField(\"timestamp\", TimestampType(), True),\n    StructField(\"value\", IntegerType(), True)\n])\n\n# Create a streaming DataFrame; the file source watches a directory for new CSV files\nstreaming_df = spark.readStream.schema(schema).csv(\"path\/to\/input\/streaming_data\")\n\n# Aggregate the values in each 10-minute window\nresult_df = streaming_df.groupBy(F.window(F.col(\"timestamp\"), \"10 minutes\")).agg(F.avg(\"value\"), F.sum(\"value\"))\n\n# Write the results to the console\nquery = result_df.writeStream.outputMode(\"complete\").format(\"console\").start()\n\n# Block until the streaming query terminates\nquery.awaitTermination()\n\n# Stop the SparkSession\nspark.stop()\n<\/code><\/pre>\n\n\n\n<p>The sample program above uses Spark to read streaming input in CSV format and analyzes it in real time. First, it defines the data 
schema and uses <code>readStream<\/code> to create a streaming DataFrame. Next, <code>groupBy()<\/code> with <code>window()<\/code> groups the data into 10-minute time windows, and <code>avg()<\/code> and <code>sum()<\/code> compute the average and the total for each window. Finally, <code>writeStream<\/code> and <code>start()<\/code> start the streaming query with the results written to the console, and <code>awaitTermination()<\/code> waits for the query to end.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Please give me an example program that uses Spark for machine learning.<\/p>\n\n\n\n<p>Certainly. The following is an example program for machine learning with 
Spark.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>from pyspark.sql import SparkSession\nfrom pyspark.ml.feature import VectorAssembler, StringIndexer\nfrom pyspark.ml.classification import LogisticRegression\nfrom pyspark.ml.evaluation import BinaryClassificationEvaluator\n\n# Create the SparkSession\nspark = SparkSession.builder.appName(\"MachineLearningExample\").getOrCreate()\n\n# Read the input file\ndf = spark.read.csv(\"path\/to\/input\/data.csv\", header=True, inferSchema=True)\n\n# Clean the data, then build the feature vector and the indexed label\ndf = df.dropna().dropDuplicates()\nassembler = VectorAssembler(inputCols=[\"col1\", \"col2\", \"col3\"], outputCol=\"features\")\ndf = assembler.transform(df)\nlabelIndexer = StringIndexer(inputCol=\"label\", outputCol=\"indexedLabel\").fit(df)\ndf = labelIndexer.transform(df)\n\n# Split into training and test sets\ntrain, test = df.randomSplit([0.7, 0.3], seed=12345)\n\n# Define the logistic regression model\nlr = LogisticRegression(featuresCol=\"features\", labelCol=\"indexedLabel\", maxIter=10)\n\n# Train the model\nmodel = lr.fit(train)\n\n# Predict on the test set\npredictions = model.transform(test)\n\n# Evaluate the model (area under the ROC curve; assumes a binary label)\nevaluator = BinaryClassificationEvaluator(rawPredictionCol=\"rawPrediction\", labelCol=\"indexedLabel\")\nauc = evaluator.evaluate(predictions)\n\n# Print the evaluation metric\nprint(\"AUC: {}\".format(auc))\n\n# Stop the SparkSession\nspark.stop()\n<\/code><\/pre>\n\n\n\n<p>The sample program above uses Spark to read a CSV 
file and then cleans and processes the data. First, <code>dropna()<\/code> and <code>dropDuplicates()<\/code> remove rows with null values and duplicate rows. Next, <code>VectorAssembler()<\/code> combines the feature columns into a single feature vector column, and <code>StringIndexer()<\/code> converts the label column into a numeric index column. <code>randomSplit()<\/code> then splits the dataset into a training set and a test set. <code>LogisticRegression()<\/code> defines a logistic regression model, and <code>fit()<\/code> trains it. Finally, <code>transform()<\/code> generates predictions for the test set, and <code>BinaryClassificationEvaluator()<\/code> evaluates the model's performance.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure 
class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Spark\u3092\u4f7f\u3063\u305f\u30b0\u30e9\u30d5\u30a3\u30c3\u30af\u51e6\u7406\u306e\u30b5\u30f3\u30d7\u30eb\u30d7\u30ed\u30b0\u30e9\u30e0\u3092\u6559\u3048\u3066\u304f\u3060\u3055\u3044<\/p>\n\n\n\n<p>\u3082\u3061\u308d\u3093\u3067\u3059\u3002 Spark \u3092\u4f7f\u7528\u3057\u305f\u30b0\u30e9\u30d5\u30a3\u30c3\u30af\u30b9\u51e6\u7406\u306e\u30d7\u30ed\u30b0\u30e9\u30e0\u4f8b\u3092\u6b21\u306b\u793a\u3057\u307e\u3059\u3002<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">pythonCopy code<code>from pyspark.sql import SparkSession\nfrom pyspark.sql.functions import *\nfrom graphframes import *\n\n# \u521b\u5efaSparkSession\nspark = SparkSession.builder.appName(\"GraphProcessingExample\").getOrCreate()\n\n# \u521b\u5efa\u8282\u70b9DataFrame\u548c\u8fb9DataFrame\nvertices = spark.createDataFrame([(1, \"A\"), (2, \"B\"), (3, \"C\"), (4, \"D\"), (5, \"E\")], [\"id\", \"name\"])\nedges = spark.createDataFrame([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 3)], [\"src\", \"dst\"])\n\n# \u521b\u5efa\u56fe\ngraph = GraphFrame(vertices, edges)\n\n# \u67e5\u8be2\u8282\u70b9\ngraph.vertices.show()\n\n# \u67e5\u8be2\u8fb9\ngraph.edges.show()\n\n# \u67e5\u8be2\u5ea6\u6570\ngraph.degrees.show()\n\n# \u67e5\u8be2\u4e09\u5143\u7ec4\ngraph.triplets.show()\n\n# \u6267\u884cPageRank\u7b97\u6cd5\nresult = graph.pageRank(resetProbability=0.15, tol=0.01)\n\n# \u8f93\u51faPageRank\u503c\nresult.vertices.show()\n\n# \u505c\u6b62SparkSession\nspark.stop()\n<\/code><\/pre>\n\n\n\n<p>\u4e0a\u8a18\u306e\u30b5\u30f3\u30d7\u30eb \u30d7\u30ed\u30b0\u30e9\u30e0\u3067\u306f\u3001\u30b0\u30e9\u30d5\u30a3\u30c3\u30af\u30b9\u51e6\u7406\u306b Spark \u3092\u4f7f\u7528\u3057\u3066\u3044\u307e\u3059\u3002 \u307e\u305a\u3001\u30ce\u30fc\u30c9 DataFrame 
\u3068\u30a8\u30c3\u30b8 DataFrame \u3092\u4f5c\u6210\u3057\u3001\u3053\u306e\u30e1\u30bd\u30c3\u30c9\u3092\u4f7f\u7528\u3057\u3066\u30b0\u30e9\u30d5\u3092\u4f5c\u6210\u3057\u307e\u3059\u3002 \u6b21\u306b\u3001\u30e1\u30bd\u30c3\u30c9\u3092\u4f7f\u7528\u3057\u3066\u3001\u30ce\u30fc\u30c9\u3001\u30a8\u30c3\u30b8\u3001\u6b21\u6570\u3001\u304a\u3088\u3073\u30c8\u30ea\u30d7\u30ec\u30c3\u30c8\u306e\u60c5\u5831\u3092\u7167\u4f1a\u3057\u307e\u3059\u3002 \u6b21\u306b\u3001\u30e1\u30bd\u30c3\u30c9\u3092\u4f7f\u7528\u3057\u3066 PageRank \u30a2\u30eb\u30b4\u30ea\u30ba\u30e0\u3092\u5b9f\u884c\u3057\u3001\u30e1\u30bd\u30c3\u30c9\u3092\u4f7f\u7528\u3057\u3066 PageRank \u5024\u3092\u51fa\u529b\u3057\u307e\u3059\u3002 \u6700\u5f8c\u306b\u3001\u30e1\u30bd\u30c3\u30c9\u3092\u4f7f\u7528\u3057\u3066 SparkSession \u3092\u505c\u6b62\u3057\u307e\u3059\u3002<code>GraphFrame()<\/code><code>show()<\/code><code>pageRank()<\/code><code>show()<\/code><code>stop()<\/code><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Github\u306e\u512a\u308c\u305fSpark\u30d9\u30fc\u30b9\u306e\u30b7\u30b9\u30c6\u30e0\u306f\u4f55\u3067\u3059\u304b<\/p>\n\n\n\n<p>Github\u306b\u30bd\u30fc\u30b9\u30b3\u30fc\u30c9\u3068\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u304c\u3042\u308b\u512a\u308c\u305fSpark\u30d9\u30fc\u30b9\u306e\u30b7\u30b9\u30c6\u30e0\u3092\u3044\u304f\u3064\u304b\u7d39\u4ecb\u3057\u307e\u3059\u3002<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Apache Flink: Apache Flink \u306f\u3001Spark \u306e RDD 
\u30e2\u30c7\u30eb\u306b\u57fa\u3065\u304f\u30aa\u30fc\u30d7\u30f3\u30bd\u30fc\u30b9\u306e\u5206\u6563\u30b9\u30c8\u30ea\u30fc\u30e0\u51e6\u7406\u30d5\u30ec\u30fc\u30e0\u30ef\u30fc\u30af\u3067\u3042\u308a\u3001\u30b9\u30c8\u30ea\u30fc\u30e0\u51e6\u7406\u3068\u30d0\u30c3\u30c1\u51e6\u7406\u3092\u30b5\u30dd\u30fc\u30c8\u3057\u3001\u9ad8\u3044\u4fe1\u983c\u6027\u3068\u9ad8\u30b9\u30eb\u30fc\u30d7\u30c3\u30c8\u3092\u5b9f\u73fe\u3057\u307e\u3059\u3002 Flink\u306e\u30bd\u30fc\u30b9\u30b3\u30fc\u30c9\u3068\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u306f\u3001https:\/\/github.com\/apache\/flink \u3067Github\u3067\u516c\u958b\u3055\u308c\u3066\u3044\u307e\u3059<a href=\"https:\/\/github.com\/apache\/flink%E3%80%82\">\u3002<\/a><\/li>\n\n\n\n<li>Apache Beam: Apache Beam\u306f\u3001Spark\u3001Flink\u3001Google Cloud Dataflow\u306a\u3069\u3001\u8907\u6570\u306e\u30d0\u30c3\u30c1\u304a\u3088\u3073\u30b9\u30c8\u30ea\u30fc\u30e0\u51e6\u7406\u30a8\u30f3\u30b8\u30f3\u3092\u30b5\u30dd\u30fc\u30c8\u3059\u308b\u30aa\u30fc\u30d7\u30f3\u30bd\u30fc\u30b9\u306e\u7d71\u5408\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0\u30e2\u30c7\u30eb\u3067\u3059\u3002 Beam\u306e\u30bd\u30fc\u30b9\u30b3\u30fc\u30c9\u3068\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u306f\u3001https:\/\/github.com\/apache\/beam \u3067Github\u3067\u516c\u958b\u3055\u308c\u3066\u3044\u307e\u3059<a href=\"https:\/\/github.com\/apache\/beam%E3%80%82\">\u3002<\/a><\/li>\n\n\n\n<li>Apache Kylin: Apache Kylin \u306f\u3001Spark \u3068 Hadoop \u3092\u30d9\u30fc\u30b9\u306b\u3057\u305f\u30aa\u30fc\u30d7\u30f3\u30bd\u30fc\u30b9\u306e OLAP \u30a8\u30f3\u30b8\u30f3\u3067\u3001\u30cf\u30a4\u30d1\u30fc\u30b9\u30b1\u30fc\u30eb\u306e\u30c7\u30fc\u30bf\u30af\u30a8\u30ea\u3068\u5206\u6790\u3092\u9ad8\u52b9\u7387\u3067\u4f7f\u3044\u3084\u3059\u3044\u65b9\u6cd5\u3067\u30b5\u30dd\u30fc\u30c8\u3057\u307e\u3059\u3002 Kylin\u306e\u30bd\u30fc\u30b9\u30b3\u30fc\u30c9\u3068\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u306f\u3001https:\/\/github.com\/apache\/kylin 
\u3067Github\u3067\u516c\u958b\u3055\u308c\u3066\u3044\u307e\u3059<a href=\"https:\/\/github.com\/apache\/kylin%E3%80%82\">\u3002<\/a><\/li>\n\n\n\n<li>Alluxio:Alluxio\u306f\u3001Spark\u306a\u3069\u306e\u30d3\u30c3\u30b0\u30c7\u30fc\u30bf\u51e6\u7406\u30a8\u30f3\u30b8\u30f3\u3068\u30b7\u30fc\u30e0\u30ec\u30b9\u306b\u7d71\u5408\u3059\u308b\u9ad8\u6027\u80fd\u306e\u30c7\u30fc\u30bf\u5171\u6709\u304a\u3088\u3073\u30a2\u30af\u30bb\u30b9\u6a5f\u80fd\u3092\u63d0\u4f9b\u3059\u308b\u30aa\u30fc\u30d7\u30f3\u30bd\u30fc\u30b9\u306e\u5206\u6563\u30d5\u30a1\u30a4\u30eb\u30b7\u30b9\u30c6\u30e0\u3067\u3059\u3002 Alluxio\u306e\u30bd\u30fc\u30b9\u30b3\u30fc\u30c9\u3068\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u306f\u3001https:\/\/github.com\/alluxio\/alluxio \u3067Github\u3067\u516c\u958b\u3055\u308c\u3066\u3044\u307e\u3059<a href=\"https:\/\/github.com\/alluxio\/alluxio%E3%80%82\">\u3002<\/a><\/li>\n\n\n\n<li>DeepLearning4j: DeepLearning4j \u306f\u3001Spark \u3068 Hadoop \u306b\u57fa\u3065\u304f\u30aa\u30fc\u30d7\u30f3\u30bd\u30fc\u30b9\u306e\u30c7\u30a3\u30fc\u30d7\u30e9\u30fc\u30cb\u30f3\u30b0 \u30d5\u30ec\u30fc\u30e0\u30ef\u30fc\u30af\u3067\u3042\u308a\u3001\u753b\u50cf\u8a8d\u8b58\u3001\u97f3\u58f0\u8a8d\u8b58\u3001\u81ea\u7136\u8a00\u8a9e\u51e6\u7406\u306a\u3069\u306e\u30a2\u30d7\u30ea\u30b1\u30fc\u30b7\u30e7\u30f3\u5411\u3051\u306b\u8907\u6570\u306e\u30cb\u30e5\u30fc\u30e9\u30eb \u30cd\u30c3\u30c8\u30ef\u30fc\u30af \u30e2\u30c7\u30eb\u3068\u30a2\u30eb\u30b4\u30ea\u30ba\u30e0\u3092\u30b5\u30dd\u30fc\u30c8\u3057\u307e\u3059\u3002 DeepLearning4j\u306e\u30bd\u30fc\u30b9\u30b3\u30fc\u30c9\u3068\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u306f\u3001https:\/\/github.com\/eclipse\/deeplearning4j \u306bGithub\u3067\u516c\u958b\u3055\u308c\u3066\u3044\u307e\u3059<a 
href=\"https:\/\/github.com\/eclipse\/deeplearning4j%E3%80%82\">\u3002<\/a><\/li>\n<\/ol>\n\n\n\n<p>\u3053\u308c\u3089\u306e\u30b7\u30b9\u30c6\u30e0\u306f\u512a\u308c\u305fSpark\u30d9\u30fc\u30b9\u306e\u30b7\u30b9\u30c6\u30e0\u3067\u3042\u308a\u3001\u958b\u767a\u8005\u304c\u5b66\u7fd2\u3057\u3066\u4f7f\u7528\u3059\u308b\u305f\u3081\u306eGithub\u306e\u30aa\u30fc\u30d7\u30f3\u30bd\u30fc\u30b9\u304a\u3088\u3073\u95a2\u9023\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u3067\u3059\u3002<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/edu.ujhb.org\/?p=1153\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/chat.openai.com\/_next\/image?url=https%3A%2F%2Flh3.googleusercontent.com%2Fa%2FAGNmyxbwVdnTP0E2eBqDrujRQjV-wFP12Thr4vFgysVv5A%3Ds96-c&amp;w=64&amp;q=75\" alt=\"gun tei\"\/><\/figure>\n\n\n\n<p>Github\u3067\u306eSpark\u306e\u512a\u308c\u305f\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u3068\u30b3\u30df\u30e5\u30cb\u30c6\u30a3\u306f\u4f55\u3067\u3059\u304b<\/p>\n\n\n\n<p>\u3053\u3053\u3067\u306f\u3001Github \u4e0a\u306e Spark \u306b\u95a2\u3059\u308b\u512a\u308c\u305f\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u3068\u30b3\u30df\u30e5\u30cb\u30c6\u30a3\u3092\u3044\u304f\u3064\u304b\u7d39\u4ecb\u3057\u307e\u3059\u3002<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\u30e9\u30fc\u30cb\u30f3\u30b0 Spark (\u516c\u5f0f\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb): \u3053\u308c\u306f Apache Spark \u304c\u63d0\u4f9b\u3059\u308b\u516c\u5f0f\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u3067\u3001Spark \u306e\u57fa\u672c\u3001\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0 \u30e2\u30c7\u30eb\u3001\u30b3\u30a2\u6982\u5ff5\u3001\u304a\u3088\u3073\u30a2\u30d7\u30ea\u30b1\u30fc\u30b7\u30e7\u30f3 \u30b1\u30fc\u30b9\u304c\u542b\u307e\u308c\u3066\u3044\u307e\u3059\u3002 \u516c\u5f0f\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u306eGithub\u30a2\u30c9\u30ec\u30b9\u306f&nbsp;<a 
href=\"https:\/\/github.com\/databricks\/learning-spark%E3%80%82\">https:\/\/github.com\/databricks\/learning-spark \u3067\u3059\u3002<\/a><\/li>\n\n\n\n<li>Spark \u306e\u4f8b: \u3053\u308c\u306f\u3001Spark SQL\u3001Spark Streaming\u3001Spark MLlib\u3001Spark GraphX \u306a\u3069\u306e\u3044\u304f\u3064\u304b\u306e\u30e2\u30b8\u30e5\u30fc\u30eb\u3092\u542b\u3080 Spark \u30b5\u30f3\u30d7\u30eb \u30d7\u30ed\u30b0\u30e9\u30e0\u306e\u30b3\u30ec\u30af\u30b7\u30e7\u30f3\u3067\u3059\u3002 \u30d7\u30ed\u30b8\u30a7\u30af\u30c8\u306e Github \u30a2\u30c9\u30ec\u30b9\u306f&nbsp;<a href=\"https:\/\/github.com\/apache\/spark\/tree\/master\/examples%E3%80%82\">https:\/\/github.com\/apache\/spark\/tree\/master\/examples \u3067\u3059\u3002<\/a><\/li>\n\n\n\n<li>\u7d20\u6674\u3089\u3057\u3044 Spark: \u3053\u308c\u306f\u3001Spark \u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u3001\u30c4\u30fc\u30eb\u3001\u30a2\u30d7\u30ea\u3001\u30b3\u30df\u30e5\u30cb\u30c6\u30a3\u3092\u542b\u3080 Spark \u30ea\u30bd\u30fc\u30b9\u306e\u30b3\u30ec\u30af\u30b7\u30e7\u30f3 \u30ea\u30b9\u30c8\u3067\u3059\u3002 Awesome Spark\u306eGithub\u30a2\u30c9\u30ec\u30b9\u306f&nbsp;<a href=\"https:\/\/github.com\/awesome-spark\/awesome-spark%E3%80%82\">https:\/\/github.com\/awesome-spark\/awesome-spark \u3067\u3059\u3002<\/a><\/li>\n\n\n\n<li>\u30b9\u30d1\u30fc\u30af\u30ea\u30f3\u30b0\u30a6\u30a9\u30fc\u30bf\u30fc:\u3053\u308c\u306f\u3001Spark\u3068H2O\u306b\u57fa\u3065\u304f\u30c7\u30a3\u30fc\u30d7\u30e9\u30fc\u30cb\u30f3\u30b0\u30d5\u30ec\u30fc\u30e0\u30ef\u30fc\u30af\u3067\u3042\u308a\u3001\u753b\u50cf\u8a8d\u8b58\u3001\u97f3\u58f0\u8a8d\u8b58\u3001\u81ea\u7136\u8a00\u8a9e\u51e6\u7406\u306a\u3069\u306e\u30a2\u30d7\u30ea\u30b1\u30fc\u30b7\u30e7\u30f3\u306b\u3055\u307e\u3056\u307e\u306a\u30cb\u30e5\u30fc\u30e9\u30eb\u30cd\u30c3\u30c8\u30ef\u30fc\u30af\u30e2\u30c7\u30eb\u3068\u30a2\u30eb\u30b4\u30ea\u30ba\u30e0\u3092\u63d0\u4f9b\u3057\u307e\u3059\u3002 
The Sparkling Water repository is at&nbsp;<a href=\"https:\/\/github.com\/h2oai\/sparkling-water\">https:\/\/github.com\/h2oai\/sparkling-water<\/a>.<\/li>\n\n\n\n<li>Spark community: the source of the official Apache Spark website, which carries the latest Spark news, events, community information, and resources. Its GitHub address is&nbsp;<a href=\"https:\/\/github.com\/apache\/spark-website\">https:\/\/github.com\/apache\/spark-website<\/a>.<\/li>\n<\/ol>\n\n\n\n<p>These tutorials and communities are good Spark resources on GitHub. They can help developers better understand Spark's fundamentals, application scenarios, and best practices, and improve their skills in building Spark applications.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I would like to make IT tutorial videos, so &hellip; 
<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_mi_skip_tracking":false,"footnotes":""},"categories":[23],"tags":[],"class_list":["post-1156","post","type-post","status-publish","format-standard","hentry","category-material"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=\/wp\/v2\/posts\/1156","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1156"}],"version-history":[{"count":1,"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=\/wp\/v2\/posts\/1156\/revisions"}],"predecessor-version":[{"id":1157,"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=\/wp\/v2\/posts\/1156\/revisions\/1157"}],"wp:attachment":[{"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1156"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1156"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/edu.ujhb.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1156"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}