How Meta trains large language models: an inside look at its approach

On June 13, Engineering at Meta published an article titled "[How Meta trains large language models at scale](https://engineering.fb.com/2024/06/12/data-infrastructure/training-large-language-models-at-scale-meta/)". It describes in detail how Meta is handling the large increase in computational scale it faces in AI research and development.

by @tf_official a year ago

The following is a brief summary of its content.

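Challenges of training large models

When training a large language model (LLM), the likelihood of an interruption caused by hardware failure rises as the number of GPUs grows. Keeping training efficient at this scale depends on the following four factors.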

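Hardware reliability: Rigorous testing and quality control keep hardware failures to a minimum.
Fast recovery on failure: When hardware does fail, the job must recover quickly. This means reducing rescheduling overhead and reinitializing training fast.
Efficient preservation of training state: Training state is checkpointed regularly and stored and retrieved efficiently, so that a job can resume from where it was interrupted.
Optimal connectivity between GPUs: Training a large model requires transferring large volumes of data between GPUs in a synchronized fashion. This calls for high-speed network infrastructure and efficient data transfer protocols.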
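
The third factor, periodic checkpointing with resume, can be sketched as follows. This is a minimal illustration and not Meta's implementation: `save_checkpoint`, `load_checkpoint`, `train`, the JSON file format, and the simulated failure are all hypothetical, and a real job would save model and optimizer state rather than a counter and a running sum.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the target.

    The rename means a crash mid-write never leaves a half-written checkpoint.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename on POSIX


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"step": 0, "loss_sum": 0.0}
    with open(path) as f:
        return json.load(f)


def train(path, total_steps, checkpoint_every, fail_at=None):
    """Toy training loop that resumes from the checkpoint at `path`.

    `fail_at` (hypothetical) simulates a hardware failure at a given step.
    """
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated hardware failure")
        state["step"] += 1
        state["loss_sum"] += 1.0 / state["step"]  # stand-in for a real training step
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

With `checkpoint_every=3` and a failure at step 7, the restarted job resumes from step 6, so only one step of work is repeated. The checkpoint interval trades write overhead against the amount of work lost per failure.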

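Innovating across the infrastructure stack

Training software

Meta enables researchers to use PyTorch and other new open source tools, which makes the path from research to production extremely fast. This work also includes developing new algorithms and techniques.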

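Scheduling

Efficient scheduling makes the best use of resources. It involves sophisticated algorithms that allocate resources according to each job's needs, and dynamic scheduling that adapts to changing workloads.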
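
As a rough illustration of allocating resources according to job needs, here is a minimal best-fit placement sketch. The article does not describe Meta's scheduling algorithm, so `Host`, `Job`, `schedule`, and the best-fit policy are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_gpus: int


@dataclass
class Job:
    name: str
    gpus_needed: int


def schedule(jobs, hosts):
    """Place each job on the fitting host with the least free capacity (best fit).

    Returns (placements, pending): a dict of job name -> host name, and the
    names of jobs that could not be placed yet. Mutates `free_gpus` on hosts.
    """
    placements = {}
    pending = []
    # Largest jobs first, so that small jobs do not fragment the hosts they need.
    for job in sorted(jobs, key=lambda j: j.gpus_needed, reverse=True):
        candidates = [h for h in hosts if h.free_gpus >= job.gpus_needed]
        if not candidates:
            pending.append(job.name)
            continue
        best = min(candidates, key=lambda h: h.free_gpus)
        best.free_gpus -= job.gpus_needed
        placements[job.name] = best.name
    return placements, pending
```

A dynamic scheduler in this style would call `schedule` again on the pending jobs whenever a job finishes and returns its GPUs to a host.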

Hardware

Training large models requires high-performance hardware. Meta modified the Grand Teton platform, which uses NVIDIA H100 GPUs, raising the GPU TDP to 700 W and moving to HBM3.

Data center deployment

Placing the chosen GPUs and systems in a data center well means making the most of resources such as power, cooling, and networking.
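
To see why power becomes a placement constraint, a back-of-the-envelope calculation helps. Only the 700 W GPU TDP comes from the article. The function name and every value a caller passes in (GPUs per server, non-GPU power per server, rack power budget) are hypothetical inputs for illustration.

```python
GPU_TDP_WATTS = 700  # GPU TDP stated in the article


def servers_per_rack(rack_budget_watts, gpus_per_server, non_gpu_watts_per_server):
    """Return (servers that fit in the rack power budget, watts per server).

    All three arguments are assumptions supplied by the caller; the article
    gives no figures for them.
    """
    server_watts = gpus_per_server * GPU_TDP_WATTS + non_gpu_watts_per_server
    return rack_budget_watts // server_watts, server_watts
```

For example, under the assumed values of a 30 kW rack, 8 GPUs per server, and 2.4 kW of non-GPU load per server, each server draws 8 kW and only three fit in the rack.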

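Network

Training large models requires a powerful, high-speed network infrastructure that can move large volumes of data quickly. Meta built two kinds of clusters, one based on RoCE and one on InfiniBand, to learn from operating each and to decide the direction of its future GenAI fabrics.

To make network communication more efficient, Meta optimized the following three aspects.

Communication patterns are assigned to different layers of the network topology so that network capabilities are used effectively.
Collective communication patterns are implemented with awareness of the network topology, which makes them less sensitive to latency.
Further investment in network load balancing and routing distributes traffic optimally.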
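
One common form of topology-aware collective is a hierarchical all-reduce, which keeps most traffic on fast local links and sends only one value per node across the slower network layer. The article does not say which collectives Meta uses, so the following is a single-process simulation under that assumption, with a hypothetical function name and data layout.

```python
def hierarchical_all_reduce(values_by_node):
    """Sum values across all ranks in three stages.

    values_by_node[n][r] is the value held by local rank r on node n.
    Returns the same shape, with every rank holding the global sum.
    """
    # Stage 1: reduce inside each node (traffic stays on local links).
    node_sums = [sum(local_values) for local_values in values_by_node]
    # Stage 2: all-reduce across one leader per node (the only cross-node traffic).
    global_sum = sum(node_sums)
    # Stage 3: each leader broadcasts the result to the ranks on its node.
    return [[global_sum for _ in local_values] for local_values in values_by_node]
```

The result matches a flat all-reduce over all ranks, but stage 2 involves one participant per node instead of every rank, which is what reduces exposure to cross-node latency.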

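Storage

To store large volumes of data efficiently, Meta is investing in high-capacity, high-speed storage technologies and developing new data storage solutions suited to specific workloads.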

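Outlook

Over the next few years, Meta expects to work with hundreds of thousands of GPUs, handle even larger volumes of data, and deal with longer distances and latencies. This will involve adopting new hardware technologies, including new GPU architectures, and evolving the infrastructure.

For details, see How Meta trains large language models at scale.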
