<?xml version="1.0" encoding="utf-8"?><?xml-stylesheet type="text/xsl" href="rss.xsl"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Eason's Blog Blog</title>
        <link>https://eason-projects.github.io/eason-blog/blog</link>
        <description>Eason's Blog Blog</description>
        <lastBuildDate>Mon, 24 Mar 2025 00:00:00 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <item>
            <title><![CDATA[A 7-Step Modeling Methodology for Operations Research]]></title>
            <link>https://eason-projects.github.io/eason-blog/blog/or-7-steps</link>
            <guid>https://eason-projects.github.io/eason-blog/blog/or-7-steps</guid>
            <pubDate>Mon, 24 Mar 2025 00:00:00 GMT</pubDate>
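            <description><![CDATA[This post describes how to carry out an operations research modeling project in seven steps.]]></description>
            <content:encoded><![CDATA[<p>This post describes how to carry out an operations research (OR) modeling project in seven steps.</p>
<p>The method comes from the OR textbook Operations Research: Applications and Algorithms.
In its first chapter, the author describes how to build a model for an OR project in seven steps,
which are as follows:</p>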
<ol>
<li>Formulate the Problem</li>
<li>Observe the System</li>
<li>Formulate a Mathematical Model of the Problem</li>
<li>Verify the Model and Use the Model for Prediction</li>
<li>Select a Suitable Alternative</li>
<li>Present the Results and Conclusion of the Study to the Organization</li>
<li>Implement and Evaluate Recommendations</li>
</ol>
<p>Let us look at each of the seven steps in turn:</p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="1-formulate-the-problem">1. Formulate the Problem<a href="https://eason-projects.github.io/eason-blog/blog/or-7-steps#1-formulate-the-problem" class="hash-link" aria-label="Direct link to 1. Formulate the Problem" title="Direct link to 1. Formulate the Problem">​</a></h2>
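<p>First, we need to fully understand what the business is asking for and what it expects. In other words, we need to know exactly which problem we are trying to solve.</p>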
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="2-observe-the-system">2. Observe the System<a href="https://eason-projects.github.io/eason-blog/blog/or-7-steps#2-observe-the-system" class="hash-link" aria-label="Direct link to 2. Observe the System" title="Direct link to 2. Observe the System">​</a></h2>
<p>Once we know which problem we need to solve, we examine the data available in the existing system.
Where necessary, we also build new systems to meet the business needs.</p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="3-formulate-a-mathematical-model-of-the-problem">3. Formulate a Mathematical Model of the Problem<a href="https://eason-projects.github.io/eason-blog/blog/or-7-steps#3-formulate-a-mathematical-model-of-the-problem" class="hash-link" aria-label="Direct link to 3. Formulate a Mathematical Model of the Problem" title="Direct link to 3. Formulate a Mathematical Model of the Problem">​</a></h2>
<p>Next, with the data and the requirements in hand, we express the problem as a model in mathematical language.</p>
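<p>As a rough illustration of what a model in mathematical language can look like, many OR problems fit the generic template below. The symbols (decision vector x, objective f, constraint functions g_i, limits b_i, domain X) are placeholders introduced here, not notation taken from the book:</p>

```latex
\begin{aligned}
\max_{x} \quad & f(x) \\
\text{s.t.} \quad & g_i(x) \le b_i, \quad i = 1, \dots, m, \\
& x \in X.
\end{aligned}
```

<p>Here x collects the decisions to be made, f measures what the business wants to achieve, and each constraint encodes a limit identified while observing the system.</p>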
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="4-verify-the-model-and-use-the-model-for-prediction">4. Verify the Model and Use the Model for Prediction<a href="https://eason-projects.github.io/eason-blog/blog/or-7-steps#4-verify-the-model-and-use-the-model-for-prediction" class="hash-link" aria-label="Direct link to 4. Verify the Model and Use the Model for Prediction" title="Direct link to 4. Verify the Model and Use the Model for Prediction">​</a></h2>
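<p>We then validate the mathematical model built in the previous step.
Where possible, we also use the model to make predictions.</p>
<p>The main point of this step is to check whether the model solves our problem accurately and appropriately.</p>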
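<p>One simple way to check accuracy, sketched here under the assumption that historical observations are available to compare against, is to measure the average relative gap between what the model predicts and what was actually observed. The function name and the numbers below are hypothetical and only illustrate the idea:</p>

```python
def mean_absolute_percentage_error(observed, predicted):
    """Return the MAPE (in percent) between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


# Hypothetical numbers, for illustration only.
historical_cost = [100.0, 200.0, 400.0]
model_cost = [110.0, 190.0, 400.0]

error = mean_absolute_percentage_error(historical_cost, model_cost)
print(f"MAPE: {error:.2f}%")
```

<p>A small error on held-out historical data is a hint, not a proof, that the model captures the system well enough to be used for prediction.</p>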
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="5-select-a-suitable-alternative">5. Select a Suitable Alternative<a href="https://eason-projects.github.io/eason-blog/blog/or-7-steps#5-select-a-suitable-alternative" class="hash-link" aria-label="Direct link to 5. Select a Suitable Alternative" title="Direct link to 5. Select a Suitable Alternative">​</a></h2>
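<p>In this step we settle the details of the overall solution that will be presented to the business users.
We then review all candidate solutions and select the one that best fits our problem.</p>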
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="6-present-the-results-and-conclusion-of-the-study-to-the-organization">6. Present the Results and Conclusion of the Study to the Organization<a href="https://eason-projects.github.io/eason-blog/blog/or-7-steps#6-present-the-results-and-conclusion-of-the-study-to-the-organization" class="hash-link" aria-label="Direct link to 6. Present the Results and Conclusion of the Study to the Organization" title="Direct link to 6. Present the Results and Conclusion of the Study to the Organization">​</a></h2>
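<p>In this step we present our solution to the business and seek its approval.</p>
<p>If the business cannot accept the final solution, or points out shortcomings, we have to go back to the beginning, examine whether we overlooked any important conditions or constraints along the way, and revise and redesign the solution accordingly.</p>
<p>If the solution is approved, we can move on to deployment and rollout.</p>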
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="7-implement-and-evaluate-recommendations">7. Implement and Evaluate Recommendations<a href="https://eason-projects.github.io/eason-blog/blog/or-7-steps#7-implement-and-evaluate-recommendations" class="hash-link" aria-label="Direct link to 7. Implement and Evaluate Recommendations" title="Direct link to 7. Implement and Evaluate Recommendations">​</a></h2>
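<p>The last step is to actually deploy the solution. Here we push the results to production, keep collecting data on the solution, and continuously track and monitor the model's accuracy.</p>
<p>This lets us catch new changes and anomalies promptly, so that the solution keeps serving the business users
and meeting their needs.</p>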
<hr>
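<p>In summary, when working on operations research projects, as well as other kinds of data or algorithm projects, we can follow this process: analyze the problem in depth until we understand it thoroughly, then design and validate the solution on real data.
Finally, once the business has signed off, we go live and keep monitoring how well the solution performs.</p>
<p>That is the seven-step modeling method used in operations research.</p>]]></content:encoded>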
            <category>Operations Research</category>
        </item>
        <item>
            <title><![CDATA[Gurobi 101]]></title>
            <link>https://eason-projects.github.io/eason-blog/blog/gurobi</link>
            <guid>https://eason-projects.github.io/eason-blog/blog/gurobi</guid>
            <pubDate>Thu, 13 Mar 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[This post shows how to get started with Gurobi and begin learning it.]]></description>
            <content:encoded><![CDATA[<p>This post shows how to get started with Gurobi and begin learning it.</p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="安装">Installation<a href="https://eason-projects.github.io/eason-blog/blog/gurobi#%E5%AE%89%E8%A3%85" class="hash-link" aria-label="Direct link to Installation" title="Direct link to Installation">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="通过docker使用gurobi">Using Gurobi via Docker<a href="https://eason-projects.github.io/eason-blog/blog/gurobi#%E9%80%9A%E8%BF%87docker%E4%BD%BF%E7%94%A8gurobi" class="hash-link" aria-label="Direct link to Using Gurobi via Docker" title="Direct link to Using Gurobi via Docker">​</a></h3>
<p>We can run the following command to start a Docker-based Python JupyterLab environment.
The environment ships with a variety of Python notebook examples that we can pick from to learn Gurobi.</p>
<div class="language-bash codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-bash codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">docker run -p 10888:8888 gurobi/modeling-examples</span><br></span></code></pre></div></div>
<p>After running the command, open
<a href="http://localhost:10888/lab" target="_blank" rel="noopener noreferrer">http://localhost:10888/lab</a> to view the JupyterLab environment,
as shown below:</p>
<p><img decoding="async" loading="lazy" alt="Jupyter landing page" src="https://eason-projects.github.io/eason-blog/assets/images/jupyter-landing-page-397fdc6693dc0e6e13085844358d96ab.png" width="2588" height="1752" class="img_e6Vo"></p>
<p>Open any folder to start working through the Gurobi examples.</p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="二进制整数规划">Binary Integer Programming<a href="https://eason-projects.github.io/eason-blog/blog/gurobi#%E4%BA%8C%E8%BF%9B%E5%88%B6%E6%95%B4%E6%95%B0%E8%A7%84%E5%88%92" class="hash-link" aria-label="Direct link to Binary Integer Programming" title="Direct link to Binary Integer Programming">​</a></h2>
<p>Let's start by solving a classic: the <a href="https://zh.wikipedia.org/wiki/%E8%83%8C%E5%8C%85%E9%97%AE%E9%A2%98" target="_blank" rel="noopener noreferrer">knapsack problem</a>.</p>
<p>Suppose we are going on a trip and want to bring some items, each of which has a weight and a value to us.
If we carry them in a single backpack, its limited capacity means we cannot take everything,
so we have to choose the subset of items that maximizes the total value.</p>
<p>Suppose we have the following items:</p>
<table><thead><tr><th>Item</th><th>Weight</th><th>Value</th></tr></thead><tbody><tr><td>Flashlight</td><td>1</td><td>5</td></tr><tr><td>Sleeping bag</td><td>3</td><td>12</td></tr><tr><td>Food</td><td>2</td><td>8</td></tr><tr><td>Water</td><td>4</td><td>15</td></tr></tbody></table>
<p>Our approach is to use Gurobi to define the variables, the constraints, and the objective,
and then find the optimal answer to the problem above.</p>
<p>First, we create a Gurobi Python model:</p>
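<p>Written out for the four items above, the problem is a small binary integer program. The variable names x_1 to x_4 (flashlight, sleeping bag, food, water, in table order) are introduced here for illustration, and the capacity of 7 is the value used in the code later in this post:</p>

```latex
\begin{aligned}
\max \quad & 5x_1 + 12x_2 + 8x_3 + 15x_4 \\
\text{s.t.} \quad & x_1 + 3x_2 + 2x_3 + 4x_4 \le 7, \\
& x_1, x_2, x_3, x_4 \in \{0, 1\}.
\end{aligned}
```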
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic"># Import Gurobi Python </span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> gurobipy </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> gp</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">Model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Knapsack"</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<p>Next, we define the problem's parameters: each item's name, weight, and value, plus the backpack's capacity:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic"># Set items with weight and value</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">items </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'flashlight'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token string" style="color:#e3116c">'weight'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'value'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">5</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'sleeping_bag'</span><span class="token 
punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token string" style="color:#e3116c">'weight'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'value'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">12</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'food'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token string" style="color:#e3116c">'weight'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'value'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">8</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'water'</span><span 
class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token string" style="color:#e3116c">'weight'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">4</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'value'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">15</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Set the max capacity</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">CAPACITY </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">7</span><br></span></code></pre></div></div>
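<p>Then we add the decision variables to the Gurobi model. Each variable is binary (1 means the item is packed, 0 means it is not) and represents the final packing decision for one item.</p>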
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic"># Add variables</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">x </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addVars</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">items</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">keys</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> vtype</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">GRB</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">BINARY</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"select"</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<p>Now we add the problem's constraint: the total weight of the selected items must not exceed the backpack's capacity.</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addConstr</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">quicksum</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">items</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">i</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'weight'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> x</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">i</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> i </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> items</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&lt;=</span><span class="token plain"> CAPACITY</span><span class="token 
punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"capacity"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
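<p>To make the constraint concrete, the plain-Python sketch below evaluates its left-hand side for one candidate packing list, without calling Gurobi. The helper name total_weight and the candidate selection are made up for illustration; the item data and capacity repeat the values defined earlier.</p>

```python
# Item data and capacity copied from the example above.
items = {
    'flashlight': {'weight': 1, 'value': 5},
    'sleeping_bag': {'weight': 3, 'value': 12},
    'food': {'weight': 2, 'value': 8},
    'water': {'weight': 4, 'value': 15},
}
CAPACITY = 7


def total_weight(selection):
    """Sum weight * x over all items, where x is 1 if selected and 0 otherwise."""
    return sum(items[i]['weight'] * (1 if i in selection else 0) for i in items)


# Hypothetical candidate: pack everything except the flashlight.
candidate = {'sleeping_bag', 'food', 'water'}
weight = total_weight(candidate)
fits = CAPACITY >= weight  # same condition as the "capacity" constraint
print(weight, fits)  # 9 False
```

<p>This candidate weighs 9, which exceeds the capacity of 7, so the solver would never return it.</p>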
<p>Finally, we add the goal we want to optimize, that is, the objective: maximize the total value of the selected items.</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">setObjective</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">quicksum</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">items</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">i</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'value'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> x</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">i</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> i </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> items</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">    gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">GRB</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">MAXIMIZE</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<p>With all of the above in place, we can start the solve by running:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">optimize</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<p>This triggers the optimization process, which produces the following output:</p>
<div class="language-plaintext codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-plaintext codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">Gurobi Optimizer version 12.0.1 build v12.0.1rc0 (linux64 - "Debian GNU/Linux 11 (bullseye)")</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">CPU model: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz, instruction set [SSE2|AVX|AVX2]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Thread count: 8 physical cores, 8 logical processors, using up to 8 threads</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Optimize a model with 1 rows, 4 columns and 4 nonzeros</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Model fingerprint: 0xc569c671</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Variable types: 0 continuous, 4 integer (4 binary)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Coefficient statistics:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  Matrix range     [1e+00, 4e+00]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  Objective range  [5e+00, 2e+01]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  Bounds range     [1e+00, 
1e+00]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  RHS range        [7e+00, 7e+00]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Found heuristic solution: objective 25.0000000</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Presolve removed 1 rows and 4 columns</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Presolve time: 0.00s</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Presolve: All rows and columns removed</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Explored 0 nodes (0 simplex iterations) in 0.05 seconds (0.00 work units)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Thread count was 1 (of 8 available processors)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Solution count 2: 28 25 </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Optimal solution found (tolerance 1.00e-04)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Best objective 2.800000000000e+01, best bound 2.800000000000e+01, gap 0.0000%</span><br></span></code></pre></div></div>
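<p>Because this instance has only four items, the reported optimum can be cross-checked by enumerating all 2^4 = 16 subsets in plain Python. This is only a sanity check on the same item data and capacity, not a description of how Gurobi solves the model; the function name brute_force_knapsack is hypothetical.</p>

```python
from itertools import combinations

# Item data and capacity copied from the example above.
items = {
    'flashlight': {'weight': 1, 'value': 5},
    'sleeping_bag': {'weight': 3, 'value': 12},
    'food': {'weight': 2, 'value': 8},
    'water': {'weight': 4, 'value': 15},
}
CAPACITY = 7


def brute_force_knapsack(items, capacity):
    """Try every subset of items and return (best_value, best_subset)."""
    best_value, best_subset = 0, ()
    names = list(items)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            weight = sum(items[i]['weight'] for i in subset)
            value = sum(items[i]['value'] for i in subset)
            # Keep the subset only if it fits and improves on the best so far.
            if capacity >= weight and value > best_value:
                best_value, best_subset = value, subset
    return best_value, best_subset


best_value, best_subset = brute_force_knapsack(items, CAPACITY)
print(best_value, best_subset)  # 28 ('flashlight', 'food', 'water')
```

<p>The enumeration agrees with the best objective of 28 shown in the log: the flashlight, the food, and the water together weigh exactly 7.</p>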
<p>We can inspect the details of the final result with the following statements:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic"># Get model's status, gp.GRB.OPTIMAL == 2</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">status</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Get the objective value</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">objVal</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Print each variables' value</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> i </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token 
getVars</span>">
plain">getVars</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f'</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">i</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">.</span><span class="token string-interpolation interpolation">varName</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">, </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">i</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">.</span><span class="token string-interpolation interpolation">x</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">'</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div>
<p>After running it, we can see that the model's optimal objective value is 28, achieved by the following combination of items:</p>
<div class="language-plaintext codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-plaintext codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">select[flashlight], 1.0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">select[sleeping_bag], 0.0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">select[food], 1.0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">select[water], 1.0</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
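<p>In other words, we should bring the flashlight, food, and water, and leave the sleeping bag behind.</p>
<p>That completes the solve.
We have carried out a very simple operations research optimization.</p>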
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="整数规划">整数规划<a href="https://eason-projects.github.io/eason-blog/blog/gurobi#%E6%95%B4%E6%95%B0%E8%A7%84%E5%88%92" class="hash-link" aria-label="Direct link to 整数规划" title="Direct link to 整数规划">​</a></h2>
<p>在<a href="https://x.com/rainmaker1973/status/1901639842150125637" target="_blank" rel="noopener noreferrer">社交媒体上</a>看到这样一个问题：</p>
<p><img decoding="async" loading="lazy" alt="Area of blue rectangle" src="https://eason-projects.github.io/eason-blog/assets/images/blue-rectangle-17a0e7ae7fefc5029b37881a77f09677.png" width="668" height="416" class="img_e6Vo"></p>
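<p>Find the area of the blue region from the given conditions.</p>
<p>We can model this directly in Gurobi with integer variables. The code is as follows:</p>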
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic"># Import Gurobi</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> gurobipy </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> gp</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Create a new model</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">m </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">Model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Size"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Define variables</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">height </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addVar</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">vtype</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">GRB</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">INTEGER</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"height"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">width </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addVar</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">vtype</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">GRB</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">INTEGER</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"width"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">x1 </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addVar</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">vtype</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">GRB</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">INTEGER</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"x1"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">x2 </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addVar</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">vtype</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">GRB</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">INTEGER</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"x2"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Define constraints</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addConstr</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">x1 </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> width </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">7</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"7m"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addConstr</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">x2 </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> width </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">8</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"8m"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addConstr</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">x1 </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> height </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">20</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"20m2"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">addConstr</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">x2 </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> height </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">25</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"25m2"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Define objective</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">m</span><span 
class="token punctuation" style="color:#393A34">.</span><span class="token plain">setObjective</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">height </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> width</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> gp</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">GRB</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">MAXIMIZE</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Solve</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">optimize</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Print values</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">getVars</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" 
style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The solver log for our model is:</p>
<div class="language-plaintext codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-plaintext codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">Gurobi Optimizer version 12.0.1 build v12.0.1rc0 (linux64 - "Debian GNU/Linux 11 (bullseye)")</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">CPU model: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz, instruction set [SSE2|AVX|AVX2]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Thread count: 8 physical cores, 8 logical processors, using up to 8 threads</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Optimize a model with 2 rows, 4 columns and 4 nonzeros</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Model fingerprint: 0x9c0c59c0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Model has 1 quadratic objective term</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Model has 2 quadratic constraints</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Variable types: 0 continuous, 4 integer (0 binary)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Coefficient statistics:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  Matrix range     [1e+00, 
1e+00]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  QMatrix range    [1e+00, 1e+00]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  Objective range  [0e+00, 0e+00]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  QObjective range [2e+00, 2e+00]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  Bounds range     [0e+00, 0e+00]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  RHS range        [7e+00, 8e+00]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  QRHS range       [2e+01, 2e+01]</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Presolve time: 0.00s</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Presolved: 12 rows, 6 columns, 25 nonzeros</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Presolved model has 3 bilinear constraint(s)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Solving non-convex MIQCP</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Variable types: 2 continuous, 4 integer (0 binary)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Root relaxation: objective 1.500000e+01, 0 iterations, 0.00 seconds (0.00 work units)</span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    Nodes    |    Current Node    |     Objective Bounds      |     Work</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">*    0     0               0      15.0000000   15.00000  0.00%     -    0s</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Explored 1 nodes (0 simplex iterations) in 0.09 seconds (0.00 work units)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Thread count was 8 (of 8 available processors)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Solution count 1: 15 </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Optimal solution found (tolerance 1.00e-04)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Best objective 1.500000000000e+01, best bound 1.500000000000e+01, gap 0.0000%</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" 
class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The variable values are:</p>
<div class="language-plaintext codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-plaintext codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">[&lt;gurobi.Var height (value 5.0)&gt;,</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> &lt;gurobi.Var width (value 3.0)&gt;,</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> &lt;gurobi.Var x1 (value 4.0)&gt;,</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> &lt;gurobi.Var x2 (value 5.0)&gt;]</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Therefore, the area of the blue region is width × height = <strong>3 × 5 = 15</strong> square meters.</p>
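<p>As a quick sanity check that needs no solver, the same four constraints can be verified by brute force. This is a minimal sketch in plain Python, not part of the Gurobi model. The function name <code>solve_rectangle</code> is our own, and we assume every length is a positive integer.</p>

```python
def solve_rectangle() -> list[tuple[int, int, int, int]]:
    """Enumerate positive integers (height, width, x1, x2) satisfying the model's constraints."""
    solutions = []
    for width in range(1, 7):            # x1 = 7 - width must stay positive
        x1, x2 = 7 - width, 8 - width    # the two linear constraints
        for height in range(1, 26):      # x2 * height == 25 caps height at 25
            if x1 * height == 20 and x2 * height == 25:
                solutions.append((height, width, x1, x2))
    return solutions


for height, width, x1, x2 in solve_rectangle():
    print(f"height={height}, width={width}, x1={x1}, x2={x2}, area={height * width}")
```

<p>Under these assumptions the enumeration finds a single combination, which matches the variable values reported by the solver.</p>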
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="分支定界算法">分支定界算法<a href="https://eason-projects.github.io/eason-blog/blog/gurobi#%E5%88%86%E6%94%AF%E5%AE%9A%E7%95%8C%E7%AE%97%E6%B3%95" class="hash-link" aria-label="Direct link to 分支定界算法" title="Direct link to 分支定界算法">​</a></h2>
<p>首先对问题进行定义，然后通过线性规划松弛（Linear Programming Relaxation）来获得小数解（作为初始的上界或者下界）。
松弛的含义就是暂时去掉结果必须是整数的限制，先通过小数来求一个在当前可行域内的全局最优小数解，作为后续分支的参考界限。</p>
<p>然后，通过分支将问题拆解为多个子问题，针对某个变量的取值添加整数约束，从而缩小解空间，探索可能的整数解。
依次遍历部分子问题，并通过剪枝（Pruning），提前终止那些目标函数值比当前已知界更差的分支，以减少计算量。</p>
<p>通过不断更新上界和下界，逐步收敛，直到找到全局最优的整数解，或者所有分支都被探索或剪枝，算法终止。</p>
<p><img decoding="async" loading="lazy" alt="Branch and Bound Algorithm Example" src="https://eason-projects.github.io/eason-blog/assets/images/branch-and-bound-youtube-screenshot-5beeedd05dca0036df8cfae3ed6ac030.png" width="2390" height="1370" class="img_e6Vo">
<em><a href="https://youtu.be/cEcS13Ku1i8?t=516" target="_blank" rel="noopener noreferrer">Branch and bound algorithm example</a></em></p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="参考资料">参考资料<a href="https://eason-projects.github.io/eason-blog/blog/gurobi#%E5%8F%82%E8%80%83%E8%B5%84%E6%96%99" class="hash-link" aria-label="Direct link to 参考资料" title="Direct link to 参考资料">​</a></h2>
<ul>
<li>Gurobi官方入门指南: <a href="https://support.gurobi.com/hc/en-us/articles/14799677517585-Getting-Started-with-Gurobi-Optimizer" target="_blank" rel="noopener noreferrer">https://support.gurobi.com/hc/en-us/articles/14799677517585-Getting-Started-with-Gurobi-Optimizer</a></li>
<li>Branch and bound algorithm example: <a href="https://www.youtube.com/watch?v=cEcS13Ku1i8" target="_blank" rel="noopener noreferrer">https://www.youtube.com/watch?v=cEcS13Ku1i8</a></li>
</ul>]]></content:encoded>
            <category>Operations Research</category>
        </item>
        <item>
            <title><![CDATA[Detecting Bluetooth Signals with Python]]></title>
            <link>https://eason-projects.github.io/eason-blog/blog/ble-beacon</link>
            <guid>https://eason-projects.github.io/eason-blog/blog/ble-beacon</guid>
            <pubDate>Sun, 09 Mar 2025 00:00:00 GMT</pubDate>
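            <description><![CDATA[This post shows how to use Python on macOS to detect BLE signals and visualize their signal strength.]]></description>
            <content:encoded><![CDATA[<p>This post shows how to use Python on macOS to detect BLE signals and visualize their signal strength.</p>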
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="常见的定位方法">常见的定位方法<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%B8%B8%E8%A7%81%E7%9A%84%E5%AE%9A%E4%BD%8D%E6%96%B9%E6%B3%95" class="hash-link" aria-label="Direct link to 常见的定位方法" title="Direct link to 常见的定位方法">​</a></h2>
<p>在设备定位的领域内，有大概3种定位技术，其分别为：UWB（超宽带）、BLE（低功耗蓝牙）和WiFi。这三种技术各有优缺点，适用于不同的场景。</p>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="uwbultra-wideband超宽带定位">UWB（Ultra-Wideband，超宽带）定位<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#uwbultra-wideband%E8%B6%85%E5%AE%BD%E5%B8%A6%E5%AE%9A%E4%BD%8D" class="hash-link" aria-label="Direct link to UWB（Ultra-Wideband，超宽带）定位" title="Direct link to UWB（Ultra-Wideband，超宽带）定位">​</a></h3>
<p>UWB是一种使用极短脉冲在宽频带上传输数据的无线通信技术。在定位领域，UWB具有以下特点：</p>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="工作原理">工作原理<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%B7%A5%E4%BD%9C%E5%8E%9F%E7%90%86" class="hash-link" aria-label="Direct link to 工作原理" title="Direct link to 工作原理">​</a></h4>
<p>UWB定位主要基于TOF（Time of Flight，飞行时间）或TDOA（Time Difference of Arrival，到达时间差）原理。设备通过测量无线电信号从发射到接收的时间来计算距离，进而确定位置。</p>
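<p>The TOF calculation itself is short: distance is the speed of light multiplied by the flight time. The sketch below also includes single-sided two-way ranging, a common way to avoid synchronizing the two clocks. That technique is our addition rather than something described above, and the function names are hypothetical.</p>

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def tof_distance(flight_time_s: float) -> float:
    """One-way time of flight: distance = speed of light * flight time."""
    return SPEED_OF_LIGHT_M_PER_S * flight_time_s


def two_way_ranging_distance(round_trip_s: float, reply_delay_s: float) -> float:
    """Single-sided two-way ranging: subtract the responder's reply delay, then halve."""
    return tof_distance((round_trip_s - reply_delay_s) / 2)


# One nanosecond of flight time corresponds to roughly 0.3 m.
print(f"{tof_distance(1e-9):.3f} m")
```

<p>Because one nanosecond of timing error already corresponds to about 30 cm, centimeter-level accuracy depends on very precise timestamps.</p>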
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="优势">优势<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E4%BC%98%E5%8A%BF" class="hash-link" aria-label="Direct link to 优势" title="Direct link to 优势">​</a></h4>
<ul>
<li><strong>高精度</strong>：UWB可以提供厘米级的定位精度（通常在10-30厘米范围内）</li>
<li><strong>抗多径干扰</strong>：由于使用极短的脉冲，UWB对多径干扰有很强的抵抗力</li>
<li><strong>穿墙能力强</strong>：UWB信号可以穿透墙壁和其他障碍物</li>
<li><strong>低功耗</strong>：相对于其他高精度定位技术，UWB的功耗较低</li>
</ul>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="劣势">劣势<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%8A%A3%E5%8A%BF" class="hash-link" aria-label="Direct link to 劣势" title="Direct link to 劣势">​</a></h4>
<ul>
<li><strong>成本高</strong>：UWB设备和基础设施的成本相对较高</li>
<li><strong>覆盖范围有限</strong>：通常需要多个锚点（基站）来覆盖较大区域</li>
<li><strong>标准化程度较低</strong>：虽然有IEEE 802.15.4z标准，但市场上的实现多样化</li>
</ul>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="应用场景">应用场景<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%BA%94%E7%94%A8%E5%9C%BA%E6%99%AF" class="hash-link" aria-label="Direct link to 应用场景" title="Direct link to 应用场景">​</a></h4>
<ul>
<li>高精度室内定位</li>
<li>资产追踪</li>
<li>智能家居</li>
<li>工业自动化</li>
<li>车辆防盗系统（如Apple AirTag等）</li>
</ul>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="blebluetooth-low-energy低功耗蓝牙定位">BLE（Bluetooth Low Energy，低功耗蓝牙）定位<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#blebluetooth-low-energy%E4%BD%8E%E5%8A%9F%E8%80%97%E8%93%9D%E7%89%99%E5%AE%9A%E4%BD%8D" class="hash-link" aria-label="Direct link to BLE（Bluetooth Low Energy，低功耗蓝牙）定位" title="Direct link to BLE（Bluetooth Low Energy，低功耗蓝牙）定位">​</a></h3>
<p>BLE是蓝牙技术的一个子集，专为低功耗应用设计。在定位领域，BLE主要通过信标（Beacon）技术实现。</p>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="工作原理-1">工作原理<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%B7%A5%E4%BD%9C%E5%8E%9F%E7%90%86-1" class="hash-link" aria-label="Direct link to 工作原理" title="Direct link to 工作原理">​</a></h4>
<p>BLE定位主要基于RSSI（Received Signal Strength Indication，接收信号强度指示）。通过测量接收到的信号强度，并结合路径损耗模型，可以估算设备与信标之间的距离。常见的协议包括iBeacon（苹果）和Eddystone（谷歌）。</p>
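<p>One widely used path loss model is the log-distance model, sketched below. The defaults are assumptions for illustration rather than values from any particular beacon: <code>tx_power_dbm</code> is the RSSI expected at 1 m, and a <code>path_loss_exponent</code> of 2.0 corresponds to free space. Both need calibrating for a real environment.</p>

```python
def rssi_to_distance(rssi_dbm: float, tx_power_dbm: float = -59.0,
                     path_loss_exponent: float = 2.0) -> float:
    """Log-distance path loss model: d = 10 ** ((tx_power - rssi) / (10 * n)), in meters."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exponent))


for rssi in (-59, -69, -79):
    print(f"RSSI {rssi} dBm -> about {rssi_to_distance(rssi):.1f} m")
```

<p>With an exponent of 2.0, every 20 dB drop in RSSI multiplies the estimated distance by ten, so a few dB of noise shifts the estimate by a large fraction of the distance.</p>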
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="优势-1">优势<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E4%BC%98%E5%8A%BF-1" class="hash-link" aria-label="Direct link to 优势" title="Direct link to 优势">​</a></h4>
<ul>
<li><strong>低功耗</strong>：BLE设备可以使用纽扣电池运行数月甚至数年</li>
<li><strong>成本低</strong>：BLE芯片和信标价格便宜，部署成本低</li>
<li><strong>兼容性好</strong>：几乎所有现代智能手机都支持BLE</li>
<li><strong>部署简单</strong>：无需复杂的基础设施</li>
</ul>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="劣势-1">劣势<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%8A%A3%E5%8A%BF-1" class="hash-link" aria-label="Direct link to 劣势" title="Direct link to 劣势">​</a></h4>
<ul>
<li><strong>精度有限</strong>：典型精度在3-5米，受环境影响大</li>
<li><strong>易受干扰</strong>：信号容易受到人体、墙壁等障碍物的影响</li>
<li><strong>距离有限</strong>：有效范围通常在50米以内</li>
</ul>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="应用场景-1">应用场景<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%BA%94%E7%94%A8%E5%9C%BA%E6%99%AF-1" class="hash-link" aria-label="Direct link to 应用场景" title="Direct link to 应用场景">​</a></h4>
<ul>
<li>商场导航</li>
<li>展览会信息推送</li>
<li>资产追踪</li>
<li>考勤系统</li>
<li>智能家居自动化</li>
</ul>
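<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="wifi定位">WiFi Positioning<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#wifi%E5%AE%9A%E4%BD%8D" class="hash-link" aria-label="Direct link to WiFi Positioning" title="Direct link to WiFi Positioning">​</a></h3>
<p>WiFi positioning uses existing WiFi infrastructure for indoor positioning and is one of the most widely deployed indoor positioning technologies.</p>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="工作原理-2">How It Works<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%B7%A5%E4%BD%9C%E5%8E%9F%E7%90%86-2" class="hash-link" aria-label="Direct link to How It Works" title="Direct link to How It Works">​</a></h4>
<p>WiFi positioning mainly takes two forms:</p>
<ol>
<li><strong>RSSI-based trilateration</strong>: measure the signal strength between the device and several WiFi access points, estimate the distance to each, and solve for the position</li>
<li><strong>Fingerprinting</strong>: survey the WiFi signal characteristics at points throughout the space in advance to build a "fingerprint database", then match live measurements against that database at positioning time</li>
</ol>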
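<p>The trilateration approach can be sketched as follows, assuming the distances to three access points have already been estimated from RSSI. The function name <code>trilaterate</code> and the 2D coordinate setup are illustrative assumptions. Subtracting the first circle equation from the other two removes the quadratic terms and leaves a 2x2 linear system, which is solved here with Cramer's rule.</p>

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def trilaterate(anchors: Sequence[Point], distances: Sequence[float]) -> Point:
    """Solve for (x, y) from exactly three anchor positions and distances."""
    if len(anchors) != 3 or len(distances) != 3:
        raise ValueError("exactly three anchors and three distances are required")
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances

    # Linearized system A * [x, y]^T = b, obtained by subtracting circle 1
    # from circles 2 and 3.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; the position is not unique")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


if __name__ == "__main__":
    aps = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    # Exact distances from the point (3, 4) to each access point
    dists = [5.0, 65 ** 0.5, 45 ** 0.5]
    print(trilaterate(aps, dists))
```

<p>With noisy RSSI-derived distances the three circles rarely meet in a single point, so practical systems typically use more access points and a least-squares fit rather than this exact solve.</p>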
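<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="优势-2">Advantages<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E4%BC%98%E5%8A%BF-2" class="hash-link" aria-label="Direct link to Advantages" title="Direct link to Advantages">​</a></h4>
<ul>
<li><strong>Widespread infrastructure</strong>: reuses the existing WiFi network, so no extra hardware is needed</li>
<li><strong>Large coverage</strong>: a single access point can cover tens of meters</li>
<li><strong>Low cost</strong>: if a WiFi network is already in place, there is almost no additional cost</li>
<li><strong>Broad compatibility</strong>: almost all mobile devices support WiFi</li>
</ul>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="劣势-2">Disadvantages<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%8A%A3%E5%8A%BF-2" class="hash-link" aria-label="Direct link to Disadvantages" title="Direct link to Disadvantages">​</a></h4>
<ul>
<li><strong>Moderate accuracy</strong>: typical accuracy is 3-15 meters, depending on the environment and access point density</li>
<li><strong>Prone to interference</strong>: signals are strongly affected by changes in the environment</li>
<li><strong>Higher power consumption</strong>: WiFi draws more power than BLE and UWB</li>
<li><strong>Complex setup</strong>: fingerprinting requires extensive up-front data collection</li>
</ul>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="应用场景-2">Use Cases<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%BA%94%E7%94%A8%E5%9C%BA%E6%99%AF-2" class="hash-link" aria-label="Direct link to Use Cases" title="Direct link to Use Cases">​</a></h4>
<ul>
<li>Navigation inside large buildings</li>
<li>Shopping mall foot-traffic analysis</li>
<li>Location services in public venues</li>
<li>Asset management</li>
<li>Smart offices</li>
</ul>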
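<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="三种技术对比">Comparison of the Three Technologies<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E4%B8%89%E7%A7%8D%E6%8A%80%E6%9C%AF%E5%AF%B9%E6%AF%94" class="hash-link" aria-label="Direct link to Comparison of the Three Technologies" title="Direct link to Comparison of the Three Technologies">​</a></h3>
<table><thead><tr><th>Technology</th><th>Accuracy</th><th>Power consumption</th><th>Cost</th><th>Coverage</th><th>Interference resistance</th></tr></thead><tbody><tr><td>UWB</td><td>10-30 cm</td><td>Medium</td><td>High</td><td>Small (~50 m)</td><td>Strong</td></tr><tr><td>BLE</td><td>3-5 m</td><td>Low</td><td>Low</td><td>Medium (~50 m)</td><td>Weak</td></tr><tr><td>WiFi</td><td>3-15 m</td><td>High</td><td>Low (uses the existing network)</td><td>Large (~100 m)</td><td>Moderate</td></tr></tbody></table>
<p>In practice, these three technologies are often combined to offset each other's weaknesses. For example, WiFi can provide coarse positioning, BLE can then confirm the zone, and UWB can deliver precise positioning where high accuracy is required.</p>
<p>Next, we focus on how to detect BLE signals with Python and visualize their signal strength.</p>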
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="python检测ble信号">Detecting BLE Signals with Python<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#python%E6%A3%80%E6%B5%8Bble%E4%BF%A1%E5%8F%B7" class="hash-link" aria-label="Direct link to Detecting BLE Signals with Python" title="Direct link to Detecting BLE Signals with Python">​</a></h2>
<p>In this section, we show how to detect and analyze BLE signals with Python, building on the code in the <a href="https://github.com/yishi-projects/ble-beacon" target="_blank" rel="noopener noreferrer">yishi-projects/ble-beacon</a> project.</p>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="所需库和依赖">Required Libraries and Dependencies<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E6%89%80%E9%9C%80%E5%BA%93%E5%92%8C%E4%BE%9D%E8%B5%96" class="hash-link" aria-label="Direct link to Required Libraries and Dependencies" title="Direct link to Required Libraries and Dependencies">​</a></h3>
<p>First, install the following Python libraries:</p>
<div class="language-bash codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-bash codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">pip install bleak kafka-python</span><br></span></code></pre></div></div>
<p>The main dependencies are:</p>
<ul>
<li><strong>bleak</strong>: a cross-platform BLE client library that supports Windows, macOS, and Linux</li>
<li><strong>kafka-python</strong>: used to send data to Kafka (optional, for stream processing)</li>
</ul>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="代码结构">Code Structure<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E4%BB%A3%E7%A0%81%E7%BB%93%E6%9E%84" class="hash-link" aria-label="Direct link to Code Structure" title="Direct link to Code Structure">​</a></h3>
<p>Our BLE detection program consists of the following parts:</p>
<ol>
<li>Initialization and configuration</li>
<li>BLE device scanning</li>
<li>Beacon data parsing (iBeacon, Eddystone, etc.)</li>
<li>Data processing and visualization</li>
</ol>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="初始化和配置">Initialization and Configuration<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%88%9D%E5%A7%8B%E5%8C%96%E5%92%8C%E9%85%8D%E7%BD%AE" class="hash-link" aria-label="Direct link to Initialization and Configuration" title="Direct link to Initialization and Configuration">​</a></h3>
<p>First, import the required libraries and set up the basic configuration:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> asyncio</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> bleak </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> BleakScanner</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> uuid</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> time</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> datetime</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> os</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> configparser</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" 
style="color:#999988;font-style:italic"># Global flag that controls the scanning state</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">_scanning_active </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">False</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Load the configuration file</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">load_config</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token triple-quoted-string string" style="color:#e3116c">"""Load configuration from ~/.ble/config.conf; create a default file if it does not exist."""</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    config </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> configparser</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">ConfigParser</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    
</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Default configuration</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    config</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'kafka'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">'broker'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'localhost:9092'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">'topic'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'ble_beacons'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Create the config directory if it does not exist</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    config_dir </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> os</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">path</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">expanduser</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"~/.ble"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    os</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">makedirs</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">config_dir</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> exist_ok</span><span class="token operator" style="color:#393A34">=</span><span class="token boolean" style="color:#36acaa">True</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    config_file </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> os</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">path</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">join</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">config_dir</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" 
style="color:#e3116c">"config.conf"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># If the config file exists, read it</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> os</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">path</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">exists</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">config_file</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        config</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">read</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">config_file</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">else</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Create the default config file</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">with</span><span class="token plain"> </span><span class="token builtin">open</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">config_file</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'w'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> f</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            config</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">write</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">f</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> config</span><br></span></code></pre></div></div>
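<p>The load-then-override behavior of <code>load_config</code> can be illustrated in isolation. The sketch below is a simplified variant, not the project's code: the helper names <code>load_with_defaults</code> and <code>demo</code>, the temporary file, and the broker host name <code>kafka.example.internal</code> are all assumptions made for illustration. It shows that values assigned before <code>config.read()</code> act as defaults: keys present in the file replace them, while missing keys keep their default values.</p>

```python
import configparser
import os
import tempfile


def load_with_defaults(path: str) -> configparser.ConfigParser:
    """Return a parser pre-filled with defaults, overridden by the file at path."""
    config = configparser.ConfigParser()
    # Defaults mirror the values used in the article's load_config()
    config["kafka"] = {"broker": "localhost:9092", "topic": "ble_beacons"}
    # ConfigParser.read() silently skips files that do not exist
    config.read(path, encoding="utf-8")
    return config


def demo() -> dict:
    """Write a partial config file and show which values end up in effect."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.conf")
        with open(path, "w", encoding="utf-8") as f:
            # Only the broker is overridden; the topic is left out on purpose
            f.write("[kafka]\nbroker = kafka.example.internal:9092\n")
        return dict(load_with_defaults(path)["kafka"])


if __name__ == "__main__":
    print(demo())
```

<p>Relying on <code>read()</code> to skip missing files removes the need for an explicit <code>os.path.exists</code> check, although writing the default file on first run, as the original function does, still requires one.</p>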
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="ble设备扫描">BLE Device Scanning<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#ble%E8%AE%BE%E5%A4%87%E6%89%AB%E6%8F%8F" class="hash-link" aria-label="Direct link to BLE Device Scanning" title="Direct link to BLE Device Scanning">​</a></h3>
<p>BLE device scanning is the core of the program. We use <code>BleakScanner</code> from the <code>bleak</code> library to scan for nearby BLE devices asynchronously:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">async</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">scan_ble_devices</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token triple-quoted-string string" style="color:#e3116c">"""Scan for BLE devices and process beacon data"""</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Get the host ID (used to tag the data)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    host_id </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> get_host_id</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Counter (used for logging)</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">    scan_count </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Mark scanning as active</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">global</span><span class="token plain"> _scanning_active</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    _scanning_active </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">True</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">try</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Continuous scan loop</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">while</span><span class="token plain"> _scanning_active</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">            scan_count </span><span class="token operator" style="color:#393A34">+=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># Scan for devices (1-second timeout)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            devices </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">await</span><span class="token plain"> BleakScanner</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">discover</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">timeout</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">1.0</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># Check whether scanning should stop</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> _scanning_active</span><span class="token punctuation" 
style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">break</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># Process each device</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            beacons_found </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> device </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> devices</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># Extract manufacturer data</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">metadata</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
string" style="color:#e3116c">'manufacturer_data'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> company_code</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> data </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">metadata</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'manufacturer_data'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">items</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token comment" style="color:#999988;font-style:italic"># Dispatch on the beacon type</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        process_beacon_data</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">company_code</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">)</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># Yield to the event loop before the next scan</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">await</span><span class="token plain"> asyncio</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">sleep</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">finally</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Clean up resources</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">pass</span><br></span></code></pre></div></div>
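<p>One caveat with the scanner above: it reads advertisement data through <code>device.metadata</code>, which recent bleak releases deprecate in favor of <code>AdvertisementData</code>. The following is a minimal sketch of the alternative, assuming a bleak version that supports <code>BleakScanner.discover(return_adv=True)</code>, which returns a dictionary mapping each address to a <code>(BLEDevice, AdvertisementData)</code> pair. The helper names <code>iter_manufacturer_data</code> and <code>scan_once</code> are hypothetical and not part of the original project.</p>

```python
from typing import Dict, Iterator, Tuple


def iter_manufacturer_data(results: Dict[str, tuple]) -> Iterator[Tuple[str, int, int, bytes]]:
    """Yield (address, rssi, company_code, payload) for each manufacturer entry.

    `results` is expected to have the shape returned by
    BleakScanner.discover(return_adv=True): address -> (device, adv_data).
    """
    for address, (_device, adv) in results.items():
        for company_code, payload in adv.manufacturer_data.items():
            yield address, adv.rssi, company_code, bytes(payload)


async def scan_once(timeout: float = 1.0) -> list:
    """Run a single scan and return the flattened manufacturer entries."""
    # Imported lazily so the pure helper above works without bleak installed
    from bleak import BleakScanner

    results = await BleakScanner.discover(timeout=timeout, return_adv=True)
    return list(iter_manufacturer_data(results))
```

<p>A single scan can be run with <code>asyncio.run(scan_once())</code>. Each yielded tuple carries the RSSI alongside the company code and payload, so it can be handed to a parser such as <code>process_beacon_data</code> without touching the deprecated attribute.</p>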
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="信标数据解析">Beacon Data Parsing<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E4%BF%A1%E6%A0%87%E6%95%B0%E6%8D%AE%E8%A7%A3%E6%9E%90" class="hash-link" aria-label="Direct link to Beacon Data Parsing" title="Direct link to Beacon Data Parsing">​</a></h3>
<p>BLE beacons come in several types; the most common are iBeacon (Apple) and Eddystone (Google). The data must be parsed according to each protocol's format:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">process_beacon_data</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">company_code</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token triple-quoted-string string" style="color:#e3116c">"""Parse the data according to the beacon type"""</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Check for iBeacon (Apple's company code is 0x004C)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> company_code </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0x004C</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">and</span><span class="token plain"> </span><span class="token builtin">len</span><span 
class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">23</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">try</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># Check the iBeacon identifier (0x02, 0x15)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0x02</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">and</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" 
style="color:#36acaa">0x15</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># Parse the iBeacon data</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                uuid_bytes </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">18</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                uuid_str </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">str</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">uuid</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">UUID</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">bytes</span><span class="token operator" style="color:#393A34">=</span><span class="token builtin">bytes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">uuid_bytes</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                major </span><span 
class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">int</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">from_bytes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">18</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">20</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> byteorder</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'big'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                minor </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">int</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">from_bytes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">20</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">22</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> byteorder</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'big'</span><span class="token punctuation" style="color:#393A34">)</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                tx_power </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">22</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">256</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">22</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">127</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">else</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">22</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                beacon_data </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span 
class="token string" style="color:#e3116c">'uuid'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> uuid_str</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token string" style="color:#e3116c">'major'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> major</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token string" style="color:#e3116c">'minor'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> minor</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token string" style="color:#e3116c">'tx_power'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tx_power</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token string" style="color:#e3116c">'rssi'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rssi</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token string" 
style="color:#e3116c">'address'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">address</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token string" style="color:#e3116c">'name'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">name </span><span class="token keyword" style="color:#00009f">or</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'Unknown'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># 处理iBeacon数据</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                process_beacon</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'iBeacon'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> beacon_data</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span 
class="token keyword" style="color:#00009f">except</span><span class="token plain"> Exception </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> e</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"处理iBeacon数据时出错: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">e</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># 检查Eddystone信标（谷歌公司代码是0x00AA）</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">elif</span><span class="token plain"> company_code </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0x00AA</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">and</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span 
class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">20</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">try</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># 检查Eddystone标识符</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0xAA</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">and</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0xFE</span><span class="token punctuation" style="color:#393A34">:</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                frame_type </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># Eddystone-UID</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> frame_type </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0x00</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    namespace </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">bytes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">3</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">13</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token builtin">hex</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    instance </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">bytes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">13</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">19</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token builtin">hex</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    beacon_data </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token string" style="color:#e3116c">'namespace'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> namespace</span><span class="token punctuation" style="color:#393A34">,</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token string" style="color:#e3116c">'instance'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> instance</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token string" style="color:#e3116c">'rssi'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rssi</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token string" style="color:#e3116c">'address'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">address</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token string" style="color:#e3116c">'name'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">name </span><span class="token keyword" style="color:#00009f">or</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'Unknown'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">                    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token comment" style="color:#999988;font-style:italic"># 处理Eddystone-UID数据</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    process_beacon</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'Eddystone-UID'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> beacon_data</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># Eddystone-URL</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">elif</span><span class="token plain"> frame_type </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0x10</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    url_scheme </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" 
style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'http://www.'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'https://www.'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'http://'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'https://'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">3</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    url_data </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">bytes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">4</span><span class="token punctuation" style="color:#393A34">:</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">decode</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'ascii'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    url </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> url_scheme </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> url_data</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    beacon_data </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token string" style="color:#e3116c">'url'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> url</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token string" style="color:#e3116c">'rssi'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rssi</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token string" style="color:#e3116c">'address'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">address</span><span 
class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token string" style="color:#e3116c">'name'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">name </span><span class="token keyword" style="color:#00009f">or</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'Unknown'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token comment" style="color:#999988;font-style:italic"># 处理Eddystone-URL数据</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    process_beacon</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'Eddystone-URL'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> beacon_data</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">except</span><span class="token plain"> Exception </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> e</span><span class="token 
punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"处理Eddystone数据时出错: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">e</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
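<p>The iBeacon byte offsets used in <code>process_beacon_data</code> can be checked against a synthetic payload. The following is a minimal standalone sketch, not part of the original script: <code>build_ibeacon_payload</code> and <code>parse_ibeacon</code> are hypothetical helper names, and the UUID, major, minor, and TX power values are made up for illustration.</p>

```python
import struct
import uuid


def build_ibeacon_payload(beacon_uuid, major, minor, tx_power):
    """Build the 23 bytes that follow Apple's company code in an iBeacon advertisement."""
    # Layout: 0x02 0x15 prefix, 16-byte UUID, big-endian major and minor,
    # then one signed byte holding the calibrated TX power.
    return (
        bytes([0x02, 0x15])
        + beacon_uuid.bytes
        + struct.pack(">HHb", major, minor, tx_power)
    )


def parse_ibeacon(data):
    """Parse an iBeacon payload with the same offsets as process_beacon_data.

    Returns a dict, or None if the payload is not an iBeacon frame.
    """
    if len(data) >= 23 and data[0] == 0x02 and data[1] == 0x15:
        return {
            "uuid": str(uuid.UUID(bytes=bytes(data[2:18]))),
            "major": int.from_bytes(data[18:20], byteorder="big"),
            "minor": int.from_bytes(data[20:22], byteorder="big"),
            # signed=True performs the same conversion as "value - 256 if value > 127"
            "tx_power": int.from_bytes(data[22:23], byteorder="big", signed=True),
        }
    return None


# Hypothetical example values
payload = build_ibeacon_payload(
    uuid.UUID("12345678-1234-5678-1234-567812345678"), 1, 2, -59
)
print(parse_ibeacon(payload))
```

<p>Round-tripping a known payload like this is a quick way to catch off-by-one errors in the slice boundaries before testing against real hardware.</p>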
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="数据处理和可视化">Data Processing and Visualization<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E6%95%B0%E6%8D%AE%E5%A4%84%E7%90%86%E5%92%8C%E5%8F%AF%E8%A7%86%E5%8C%96" class="hash-link" aria-label="Direct link to Data Processing and Visualization" title="Direct link to Data Processing and Visualization">​</a></h3>
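<p>The collected BLE signal data can be processed and visualized in several ways:</p>
<ol>
<li><strong>Real-time display</strong>: use a GUI library (such as Tkinter or PyQt) to show the detected devices and their signal strength in real time</li>
<li><strong>Data storage</strong>: save the data to a local file or a database</li>
<li><strong>Stream processing</strong>: use a message queue such as Kafka to process the data stream in real time</li>
<li><strong>Signal-strength visualization</strong>: use a library such as matplotlib to draw signal-strength heat maps or time-series plots</li>
</ol>
<p>Below is a simple example of a data-processing function:</p>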
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">process_beacon</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">beacon_type</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> beacon_data</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token triple-quoted-string string" style="color:#e3116c">"""处理信标数据"""</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    timestamp </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> datetime</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">datetime</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">now</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">isoformat</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># 添加通用字段</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    message </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">'type'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> beacon_type</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">'timestamp'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> timestamp</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">'rssi'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> beacon_data</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'rssi'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" 
style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">'address'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> beacon_data</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'address'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'unknown'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># 添加特定类型的字段</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    message</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">update</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">beacon_data</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># 
这里可以添加数据处理逻辑</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># 例如：保存到文件、发送到服务器、更新GUI等</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> message</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
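<p>The comments in the function above mention saving the message to a file as one possible processing step. The following is a minimal sketch of that idea, not part of the original code: it appends each message to a JSON Lines file. The helper names <code>save_message</code> and <code>load_messages</code>, the file name, and the sample field values are all hypothetical; only the keys <code>type</code>, <code>timestamp</code>, <code>rssi</code>, and <code>address</code> come from the message built above.</p>

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def save_message(message: dict, path: Path) -> None:
    """Append one message to a JSON Lines file (one JSON object per line)."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(message, ensure_ascii=False) + "\n")


def load_messages(path: Path) -> list[dict]:
    """Read every stored message back, skipping blank lines."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Hypothetical message with the same common fields as the function above.
    sample = {
        "type": "ibeacon",
        "timestamp": 1711238400.0,
        "rssi": -67,
        "address": "AA:BB:CC:DD:EE:FF",
    }
    log_path = Path(tempfile.gettempdir()) / "beacon_log.jsonl"
    save_message(sample, log_path)
    print(load_messages(log_path)[-1])
```

<p>Appending one JSON object per line keeps each write independent, so a long-running scanner can log messages without rewriting the whole file. This assumes every value in the message is JSON-serializable.</p>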
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="完整示例">Complete Example<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%AE%8C%E6%95%B4%E7%A4%BA%E4%BE%8B" class="hash-link" aria-label="Direct link to Complete Example" title="Direct link to Complete Example">​</a></h3>
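<p>Below is a simple but complete BLE scanner example. It scans for nearby BLE devices for five seconds, prints each device's address, name, and RSSI, and decodes the UUID, Major, and Minor of any iBeacon it detects:</p>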
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> asyncio</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> bleak </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> BleakScanner</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> uuid</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> datetime</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">async</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">main</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" 
style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"开始扫描BLE设备..."</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># 扫描设备</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    devices </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">await</span><span class="token plain"> BleakScanner</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">discover</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">timeout</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">5.0</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"发现 </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation builtin">len</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">(</span><span class="token string-interpolation 
interpolation">devices</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">)</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c"> 个设备"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># 处理每个设备</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> device </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> devices</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"设备: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">device</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">.</span><span class="token string-interpolation interpolation">address</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c"> (</span><span class="token string-interpolation interpolation punctuation" 
style="color:#393A34">{</span><span class="token string-interpolation interpolation">device</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">.</span><span class="token string-interpolation interpolation">name </span><span class="token string-interpolation interpolation keyword" style="color:#00009f">or</span><span class="token string-interpolation interpolation"> </span><span class="token string-interpolation interpolation string" style="color:#e3116c">'Unknown'</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">), RSSI: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">device</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">.</span><span class="token string-interpolation interpolation">rssi</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># 提取制造商数据</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">metadata</span><span class="token punctuation" style="color:#393A34">.</span><span class="token 
plain">get</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'manufacturer_data'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> company_code</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> data </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">metadata</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">'manufacturer_data'</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">items</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># 检查iBeacon</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> company_code </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0x004C</span><span class="token plain"> </span><span class="token keyword" 
style="color:#00009f">and</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">23</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0x02</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">and</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0x15</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token comment" style="color:#999988;font-style:italic"># 解析iBeacon数据</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">                        uuid_bytes </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">18</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        uuid_str </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">str</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">uuid</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">UUID</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">bytes</span><span class="token operator" style="color:#393A34">=</span><span class="token builtin">bytes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">uuid_bytes</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        major </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">int</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">from_bytes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" 
style="color:#393A34">[</span><span class="token number" style="color:#36acaa">18</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">20</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> byteorder</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'big'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        minor </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">int</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">from_bytes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">20</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">22</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> byteorder</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'big'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"  iBeacon: UUID=</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">uuid_str</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">, Major=</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">major</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">, Minor=</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">minor</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> __name__ </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"__main__"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    asyncio</span><span class="token punctuation" style="color:#393A34">.</span><span 
class="token plain">run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">main</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
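<p>The iBeacon check in the example above can also be pulled out into a pure function, which makes it testable without any Bluetooth hardware. The sketch below applies the same tests as the example (company code <code>0x004C</code>, at least 23 bytes, prefix <code>0x02 0x15</code>) and the same byte offsets. The names <code>parse_ibeacon</code>, <code>IBeacon</code>, and <code>APPLE_COMPANY_ID</code> are hypothetical, and decoding the 23rd byte as a signed TX power value is an addition that the original example does not perform.</p>

```python
import uuid
from typing import NamedTuple, Optional

APPLE_COMPANY_ID = 0x004C  # company code checked in the example above


class IBeacon(NamedTuple):
    proximity_uuid: str
    major: int
    minor: int
    tx_power: int  # signed calibration byte (assumption: not parsed in the original)


def parse_ibeacon(company_code: int, data: bytes) -> Optional[IBeacon]:
    """Return the decoded iBeacon fields, or None if the payload is not an iBeacon."""
    is_ibeacon = (
        company_code == APPLE_COMPANY_ID
        and len(data) >= 23
        and data[0] == 0x02
        and data[1] == 0x15
    )
    if not is_ibeacon:
        return None
    return IBeacon(
        proximity_uuid=str(uuid.UUID(bytes=bytes(data[2:18]))),
        major=int.from_bytes(data[18:20], byteorder="big"),
        minor=int.from_bytes(data[20:22], byteorder="big"),
        tx_power=int.from_bytes(data[22:23], byteorder="big", signed=True),
    )


if __name__ == "__main__":
    # Hand-built payload for illustration only, not captured from a real device.
    payload = (
        bytes([0x02, 0x15])
        + uuid.UUID("12345678-1234-5678-1234-567812345678").bytes
        + (1).to_bytes(2, "big")
        + (2).to_bytes(2, "big")
        + bytes([0xC5])
    )
    print(parse_ibeacon(APPLE_COMPANY_ID, payload))
```

<p>Inside the scan loop, this function could be called once for each <code>company_code, data</code> pair. Note that recent bleak releases deprecate <code>BLEDevice.metadata</code> and <code>BLEDevice.rssi</code>. If the example above raises an <code>AttributeError</code>, calling <code>BleakScanner.discover(return_adv=True)</code>, which returns the advertisement data alongside each device, is probably the replacement to look at.</p>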
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="信号强度可视化">Signal Strength Visualization<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E4%BF%A1%E5%8F%B7%E5%BC%BA%E5%BA%A6%E5%8F%AF%E8%A7%86%E5%8C%96" class="hash-link" aria-label="Direct link to Signal Strength Visualization" title="Direct link to Signal Strength Visualization">​</a></h3>
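<p>To show BLE signal strength more intuitively, we can plot it with the matplotlib library. The following simple example shows how to chart a device's signal strength (RSSI) over time:</p>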
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> matplotlib</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pyplot </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> plt</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> numpy </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> np</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> time</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> asyncio</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> bleak </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> BleakScanner</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">async</span><span class="token plain"> </span><span class="token keyword" 
style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">monitor_device</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">address</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> duration</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">60</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token triple-quoted-string string" style="color:#e3116c">"""监控特定设备的信号强度"""</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    timestamps </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    rssi_values </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    start_time </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> time</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">time</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    end_time </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> start_time </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> duration</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">while</span><span class="token plain"> time</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">time</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain"> end_time</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># 扫描设备</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        devices </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">await</span><span class="token plain"> BleakScanner</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">discover</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">timeout</span><span class="token operator" style="color:#393A34">=</span><span class="token number" 
style="color:#36acaa">1.0</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Find the target device</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> device </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> devices</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">address </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> address</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token comment" style="color:#999988;font-style:italic"># Record the elapsed time and RSSI</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                timestamps</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">time</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">time</span><span class="token 
punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> start_time</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                rssi_values</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">device</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">rssi</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"Time: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">timestamps</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">[</span><span class="token string-interpolation interpolation operator" style="color:#393A34">-</span><span class="token string-interpolation interpolation number" style="color:#36acaa">1</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">]</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">:</span><span class="token string-interpolation interpolation format-spec">.1f</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span 
class="token string-interpolation string" style="color:#e3116c">s, RSSI: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">rssi_values</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">[</span><span class="token string-interpolation interpolation operator" style="color:#393A34">-</span><span class="token string-interpolation interpolation number" style="color:#36acaa">1</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">]</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c"> dBm"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">break</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Wait for the next scan</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">await</span><span class="token plain"> asyncio</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">sleep</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">0.5</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Plot the chart</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">figure</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">figsize</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">6</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">plot</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">timestamps</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> rssi_values</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'b-'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">xlabel</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
string" style="color:#e3116c">'Time (s)'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">ylabel</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'Signal strength (dBm)'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">title</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f'BLE signal strength of device </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">address</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">grid</span><span class="token punctuation" style="color:#393A34">(</span><span class="token boolean" style="color:#36acaa">True</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">savefig</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'ble_signal_strength.png'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    plt</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">show</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Usage example</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># asyncio.run(monitor_device('XX:XX:XX:XX:XX:XX', 60))</span><br></span></code></pre></div></div>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="ios中监听ibeacon信号">Monitoring iBeacon Signals on iOS<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#ios%E4%B8%AD%E7%9B%91%E5%90%ACibeacon%E4%BF%A1%E5%8F%B7" class="hash-link" aria-label="Direct link to iOS中监听iBeacon信号" title="Direct link to iOS中监听iBeacon信号">​</a></h2>
<p>On iOS, generic BLE scanning cannot correctly retrieve iBeacon signal data; in particular, fields such as the UUID, Major, and Minor are not available.</p>
<p>We therefore have to explicitly specify the UUID we want to monitor.</p>
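<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="蓝牙后台运行的时间问题">Bluetooth Background Execution Time Limits<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E8%93%9D%E7%89%99%E5%90%8E%E5%8F%B0%E8%BF%90%E8%A1%8C%E7%9A%84%E6%97%B6%E9%97%B4%E9%97%AE%E9%A2%98" class="hash-link" aria-label="Direct link to 蓝牙后台运行的时间问题" title="Direct link to 蓝牙后台运行的时间问题">​</a></h3>
<p>Bluetooth background mode cannot keep an app running continuously, so we cannot rely on it for uninterrupted monitoring.
If we need the app to run without interruption for a long period, we have to explicitly keep it in the foreground.</p>
<p>In split-screen mode, the apps on both the left and the right side are treated as actively running.</p>
<p>Even with an Apple enterprise account, there is no way to force an app to keep running in the background for a long time.</p>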
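<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="总结">Summary<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E6%80%BB%E7%BB%93" class="hash-link" aria-label="Direct link to 总结" title="Direct link to 总结">​</a></h2>
<p>With Python and the bleak library, we can easily detect and analyze BLE signals. This approach suits many scenarios, such as indoor positioning, asset tracking, and presence detection.</p>
<p>In practice, we can extend the code above as needed, for example:</p>
<ul>
<li>Add distance estimation (based on RSSI and a path-loss model)</li>
<li>Implement a trilateration positioning algorithm</li>
<li>Build a real-time monitoring dashboard</li>
<li>Integrate machine learning algorithms for pattern recognition</li>
</ul>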
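<p>As a rough illustration of the first extension, the sketch below applies the log-distance path-loss model, d = 10^((P_tx - RSSI) / (10 n)), to an RSSI reading. The function names <code>estimate_distance</code> and <code>smooth_rssi</code>, the default 1 m reference power of -59 dBm, and the path-loss exponent of 2.0 are assumptions chosen for illustration; they are not part of the original code and would need calibrating against a real beacon and environment.</p>

```python
from typing import List, Sequence


def estimate_distance(
    rssi_dbm: float,
    tx_power_dbm: float = -59.0,      # assumed RSSI measured at 1 m; calibrate per beacon
    path_loss_exponent: float = 2.0,  # assumed: about 2 in free space, larger indoors
) -> float:
    """Estimate the distance in metres with the log-distance path-loss model."""
    if path_loss_exponent <= 0:
        raise ValueError("path_loss_exponent must be positive")
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def smooth_rssi(values: Sequence[float], window: int = 5) -> List[float]:
    """Trailing moving average, since single RSSI readings are noisy."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


if __name__ == "__main__":
    readings = [-62, -65, -61, -70, -66]  # made-up sample readings in dBm
    for rssi in smooth_rssi(readings, window=3):
        print(f"RSSI {rssi:.1f} dBm -> about {estimate_distance(rssi):.2f} m")
```

<p>Smoothing before converting to distance is a deliberate choice here: the model is exponential in RSSI, so a few dB of noise would otherwise swing the distance estimate considerably.</p>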
<p>BLE beacon technology, combined with Python's flexibility, gives us a powerful tool for building all kinds of smart-space applications.</p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="参考资料">References<a href="https://eason-projects.github.io/eason-blog/blog/ble-beacon#%E5%8F%82%E8%80%83%E8%B5%84%E6%96%99" class="hash-link" aria-label="Direct link to 参考资料" title="Direct link to 参考资料">​</a></h2>
<ul>
<li><a href="https://github.com/yishi-projects/ble-beacon" target="_blank" rel="noopener noreferrer">yishi-projects/ble-beacon</a> - BLE beacon detection project</li>
<li><a href="https://bleak.readthedocs.io/" target="_blank" rel="noopener noreferrer">Bleak documentation</a> - Cross-platform BLE client library</li>
<li><a href="https://www.bluetooth.com/" target="_blank" rel="noopener noreferrer">Bluetooth SIG</a> - Bluetooth technology standards</li>
</ul>]]></content:encoded>
            <category>Bluetooth</category>
        </item>
        <item>
            <title><![CDATA[A Docker-Based Machine Learning Development Environment]]></title>
            <link>https://eason-projects.github.io/eason-blog/blog/ml-env-setup</link>
            <guid>https://eason-projects.github.io/eason-blog/blog/ml-env-setup</guid>
            <pubDate>Tue, 18 Feb 2025 00:00:00 GMT</pubDate>
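            <description><![CDATA[During machine learning development, we often need an experimental environment.]]></description>
            <content:encoded><![CDATA[<p>During machine learning development, we often need an experimental environment,
for example one that comes with Python preinstalled along with commonly used Python packages such as sklearn and seaborn.</p>
<p>To keep different development environments isolated from one another, this article uses Docker to build such an environment
for machine learning development.</p>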
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="背景介绍">Background<a href="https://eason-projects.github.io/eason-blog/blog/ml-env-setup#%E8%83%8C%E6%99%AF%E4%BB%8B%E7%BB%8D" class="hash-link" aria-label="Direct link to 背景介绍" title="Direct link to 背景介绍">​</a></h2>
<p>In day-to-day machine learning development, a relatively simple approach is to download and install <a href="https://www.anaconda.com/" target="_blank" rel="noopener noreferrer">Anaconda</a>
and manage different environments with the <code>conda</code> command.</p>
<p><img decoding="async" loading="lazy" alt="Anaconda Logo" src="https://eason-projects.github.io/eason-blog/assets/images/anaconda_secondary_logo-f5a5a1bc1460566805d7566ebbded336.svg" width="177" height="30" class="img_e6Vo"></p>
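<p>Alternatively, we can use tools such as Python's <a href="https://docs.python.org/3/library/venv.html" target="_blank" rel="noopener noreferrer">venv</a>
to build a relatively independent environment.</p>
<p>These setups provide some isolation, mainly of Python packages, but it is not thorough: system-level libraries and tools are still shared with the host.</p>
<p>This article therefore introduces a Docker-based approach to building the development environment a machine learning engineer needs.</p>
<p>Our target readers are mainly engineers who use traditional machine learning algorithms to solve relatively simple machine learning tasks.
We will also briefly cover development environments for deep learning.</p>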
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="jupyter">Jupyter<a href="https://eason-projects.github.io/eason-blog/blog/ml-env-setup#jupyter" class="hash-link" aria-label="Direct link to Jupyter" title="Direct link to Jupyter">​</a></h2>
<p>For a machine learning engineer, the most frequently used development environment
is very likely <a href="https://jupyter.org/" target="_blank" rel="noopener noreferrer">Jupyter</a>,
especially an interactive development environment such as
<a href="https://github.com/jupyterlab" target="_blank" rel="noopener noreferrer">Jupyter Lab</a>.</p>
<p><img decoding="async" loading="lazy" alt="Jupyter Logo" src="https://eason-projects.github.io/eason-blog/assets/images/jupyter-logo-f9e82ab8cc1a14dc6cd3763265733bdb.png" width="855" height="394" class="img_e6Vo"></p>
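<p>The Jupyter project maintains a separate project dedicated to Docker deployment, which documents the available Docker images:
<a href="https://jupyter-docker-stacks.readthedocs.io/" target="_blank" rel="noopener noreferrer">Jupyter Docker Stacks</a></p>
<p>The project offers several images to choose from; the diagram below shows how they build on one another:</p>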
<p><img decoding="async" loading="lazy" alt="Jupyter Docker Images" src="data:image/svg+xml;base64,PD94bWwgdmVyc2lvbj0nMS4wJyBlbmNvZGluZz0nVVRGLTgnPz4KPCFET0NUWVBFIHN2ZyBQVUJMSUMgIi0vL1czQy8vRFREIFNWRyAxLjAvL0VOIiAiaHR0cDovL3d3dy53My5vcmcvVFIvMjAwMS9SRUMtU1ZHLTIwMDEwOTA0L0RURC9zdmcxMC5kdGQiPgo8c3ZnIHZpZXdCb3g9IjAgMCA4MzIgNjAwIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHhtbG5zOmlua3NwYWNlPSJodHRwOi8vd3d3Lmlua3NjYXBlLm9yZy9uYW1lc3BhY2VzL2lua3NjYXBlIiB4bWxuczp4bGluaz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94bGluayI+CiAgPGRlZnMgaWQ9ImRlZnNfYmxvY2siPgogICAgPGZpbHRlciBoZWlnaHQ9IjEuNTA0IiBpZD0iZmlsdGVyX2JsdXIiIGlua3NwYWNlOmNvbGxlY3Q9ImFsd2F5cyIgd2lkdGg9IjEuMTU3NSIgeD0iLTAuMDc4NzUiIHk9Ii0wLjI1MiI+CiAgICAgIDxmZUdhdXNzaWFuQmx1ciBpZD0iZmVHYXVzc2lhbkJsdXIzNzgwIiBpbmtzcGFjZTpjb2xsZWN0PSJhbHdheXMiIHN0ZERldmlhdGlvbj0iNC4yIiAvPgogICAgPC9maWx0ZXI+CiAgPC9kZWZzPgogIDx0aXRsZT5ibG9ja2RpYWc8L3RpdGxlPgogIDxkZXNjIC8+CiAgPHJlY3QgZmlsbD0icmdiKDAsMCwwKSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiBzdHlsZT0iZmlsdGVyOnVybCgjZmlsdGVyX2JsdXIpO29wYWNpdHk6MC43O2ZpbGwtb3BhY2l0eToxIiB3aWR0aD0iMTI4IiB4PSI2NyIgeT0iNDYiIC8+CiAgPHJlY3QgZmlsbD0icmdiKDAsMCwwKSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiBzdHlsZT0iZmlsdGVyOnVybCgjZmlsdGVyX2JsdXIpO29wYWNpdHk6MC43O2ZpbGwtb3BhY2l0eToxIiB3aWR0aD0iMTI4IiB4PSI2NyIgeT0iMTI2IiAvPgogIDxyZWN0IGZpbGw9InJnYigwLDAsMCkiIGhlaWdodD0iNDAiIHN0cm9rZT0icmdiKDAsMCwwKSIgc3R5bGU9ImZpbHRlcjp1cmwoI2ZpbHRlcl9ibHVyKTtvcGFjaXR5OjAuNztmaWxsLW9wYWNpdHk6MSIgd2lkdGg9IjEyOCIgeD0iNjciIHk9IjIwNiIgLz4KICA8cmVjdCBmaWxsPSJyZ2IoMCwwLDApIiBoZWlnaHQ9IjQwIiBzdHJva2U9InJnYigwLDAsMCkiIHN0eWxlPSJmaWx0ZXI6dXJsKCNmaWx0ZXJfYmx1cik7b3BhY2l0eTowLjc7ZmlsbC1vcGFjaXR5OjEiIHdpZHRoPSIxMjgiIHg9IjY3IiB5PSIyODYiIC8+CiAgPHJlY3QgZmlsbD0icmdiKDAsMCwwKSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiBzdHlsZT0iZmlsdGVyOnVybCgjZmlsdGVyX2JsdXIpO29wYWNpdHk6MC43O2ZpbGwtb3BhY2l0eToxIiB3aWR0aD0iMTI4IiB4PSI2NyIgeT0iMzY2IiAvPgogIDxyZWN0IGZpbGw9InJnYigwLDAsMCkiIGhlaWdodD0iNDAiIHN0cm9rZT0icmdiKDAsMCwwKSIgc3R5bGU9ImZpbHRlcjp1cmwoI2ZpbHRlc
l9ibHVyKTtvcGFjaXR5OjAuNztmaWxsLW9wYWNpdHk6MSIgd2lkdGg9IjEyOCIgeD0iMjU5IiB5PSIzNjYiIC8+CiAgPHJlY3QgZmlsbD0icmdiKDAsMCwwKSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiBzdHlsZT0iZmlsdGVyOnVybCgjZmlsdGVyX2JsdXIpO29wYWNpdHk6MC43O2ZpbGwtb3BhY2l0eToxIiB3aWR0aD0iMTI4IiB4PSI0NTEiIHk9IjM2NiIgLz4KICA8cmVjdCBmaWxsPSJyZ2IoMCwwLDApIiBoZWlnaHQ9IjQwIiBzdHJva2U9InJnYigwLDAsMCkiIHN0eWxlPSJmaWx0ZXI6dXJsKCNmaWx0ZXJfYmx1cik7b3BhY2l0eTowLjc7ZmlsbC1vcGFjaXR5OjEiIHdpZHRoPSIxMjgiIHg9IjY3IiB5PSI0NDYiIC8+CiAgPHJlY3QgZmlsbD0icmdiKDAsMCwwKSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiBzdHlsZT0iZmlsdGVyOnVybCgjZmlsdGVyX2JsdXIpO29wYWNpdHk6MC43O2ZpbGwtb3BhY2l0eToxIiB3aWR0aD0iMTI4IiB4PSIyNTkiIHk9IjQ0NiIgLz4KICA8cmVjdCBmaWxsPSJyZ2IoMCwwLDApIiBoZWlnaHQ9IjQwIiBzdHJva2U9InJnYigwLDAsMCkiIHN0eWxlPSJmaWx0ZXI6dXJsKCNmaWx0ZXJfYmx1cik7b3BhY2l0eTowLjc7ZmlsbC1vcGFjaXR5OjEiIHdpZHRoPSIxMjgiIHg9IjQ1MSIgeT0iNDQ2IiAvPgogIDxyZWN0IGZpbGw9InJnYigwLDAsMCkiIGhlaWdodD0iNDAiIHN0cm9rZT0icmdiKDAsMCwwKSIgc3R5bGU9ImZpbHRlcjp1cmwoI2ZpbHRlcl9ibHVyKTtvcGFjaXR5OjAuNztmaWxsLW9wYWNpdHk6MSIgd2lkdGg9IjEyOCIgeD0iNjQzIiB5PSI0NDYiIC8+CiAgPHJlY3QgZmlsbD0icmdiKDAsMCwwKSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiBzdHlsZT0iZmlsdGVyOnVybCgjZmlsdGVyX2JsdXIpO29wYWNpdHk6MC43O2ZpbGwtb3BhY2l0eToxIiB3aWR0aD0iMTI4IiB4PSI2NDMiIHk9IjUyNiIgLz4KICA8cmVjdCBmaWxsPSJyZ2IoMjU1LDI1NSwyNTUpIiBoZWlnaHQ9IjQwIiBzdHJva2U9InJnYigwLDAsMCkiIHdpZHRoPSIxMjgiIHg9IjY0IiB5PSI0MCIgLz4KICA8dGV4dCBmaWxsPSJyZ2IoMCwwLDApIiBmb250LWZhbWlseT0ic2Fucy1zZXJpZiIgZm9udC1zaXplPSI5IiBmb250LXN0eWxlPSJub3JtYWwiIGZvbnQtd2VpZ2h0PSJub3JtYWwiIHRleHQtYW5jaG9yPSJtaWRkbGUiIHRleHRMZW5ndGg9IjMwIiB4PSIxMjgiIHk9IjU5Ij51YnVudHU8L3RleHQ+CiAgPHRleHQgZmlsbD0icmdiKDAsMCwwKSIgZm9udC1mYW1pbHk9InNhbnMtc2VyaWYiIGZvbnQtc2l6ZT0iOSIgZm9udC1zdHlsZT0ibm9ybWFsIiBmb250LXdlaWdodD0ibm9ybWFsIiB0ZXh0LWFuY2hvcj0ibWlkZGxlIiB0ZXh0TGVuZ3RoPSIxMTkiIHg9IjEyOCIgeT0iNzAiPihMVFMgd2l0aCBwb2ludCByZWxlYXNlKTwvdGV4dD4KICA8cmVjdCBmaWxsPSJyZ2IoMjU1LDI1NSwyNTUpIiBoZWlnaHQ9IjQwIiBzdHJva2U9InJnYigwLDAsMCkiIHdpZHRoPSIxMjgiIHg9IjY0IiB5P
SIxMjAiIC8+CiAgPHRleHQgZmlsbD0icmdiKDAsMCwwKSIgZm9udC1mYW1pbHk9InNhbnMtc2VyaWYiIGZvbnQtc2l6ZT0iOSIgZm9udC1zdHlsZT0ibm9ybWFsIiBmb250LXdlaWdodD0ibm9ybWFsIiB0ZXh0LWFuY2hvcj0ibWlkZGxlIiB0ZXh0TGVuZ3RoPSIxMTkiIHg9IjEyOCIgeT0iMTQ1Ij5kb2NrZXItc3RhY2tzLWZvdW5kYXRpb248L3RleHQ+CiAgPHJlY3QgZmlsbD0icmdiKDI1NSwyNTUsMjU1KSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiB3aWR0aD0iMTI4IiB4PSI2NCIgeT0iMjAwIiAvPgogIDx0ZXh0IGZpbGw9InJnYigwLDAsMCkiIGZvbnQtZmFtaWx5PSJzYW5zLXNlcmlmIiBmb250LXNpemU9IjkiIGZvbnQtc3R5bGU9Im5vcm1hbCIgZm9udC13ZWlnaHQ9Im5vcm1hbCIgdGV4dC1hbmNob3I9Im1pZGRsZSIgdGV4dExlbmd0aD0iNjUiIHg9IjEyOCIgeT0iMjI1Ij5iYXNlLW5vdGVib29rPC90ZXh0PgogIDxyZWN0IGZpbGw9InJnYigyNTUsMjU1LDI1NSkiIGhlaWdodD0iNDAiIHN0cm9rZT0icmdiKDAsMCwwKSIgd2lkdGg9IjEyOCIgeD0iNjQiIHk9IjI4MCIgLz4KICA8dGV4dCBmaWxsPSJyZ2IoMCwwLDApIiBmb250LWZhbWlseT0ic2Fucy1zZXJpZiIgZm9udC1zaXplPSI5IiBmb250LXN0eWxlPSJub3JtYWwiIGZvbnQtd2VpZ2h0PSJub3JtYWwiIHRleHQtYW5jaG9yPSJtaWRkbGUiIHRleHRMZW5ndGg9IjgwIiB4PSIxMjgiIHk9IjMwNSI+bWluaW1hbC1ub3RlYm9vazwvdGV4dD4KICA8cmVjdCBmaWxsPSJyZ2IoMjU1LDI1NSwyNTUpIiBoZWlnaHQ9IjQwIiBzdHJva2U9InJnYigwLDAsMCkiIHdpZHRoPSIxMjgiIHg9IjY0IiB5PSIzNjAiIC8+CiAgPHRleHQgZmlsbD0icmdiKDAsMCwwKSIgZm9udC1mYW1pbHk9InNhbnMtc2VyaWYiIGZvbnQtc2l6ZT0iOSIgZm9udC1zdHlsZT0ibm9ybWFsIiBmb250LXdlaWdodD0ibm9ybWFsIiB0ZXh0LWFuY2hvcj0ibWlkZGxlIiB0ZXh0TGVuZ3RoPSI3MCIgeD0iMTI4IiB5PSIzODUiPnNjaXB5LW5vdGVib29rPC90ZXh0PgogIDxyZWN0IGZpbGw9InJnYigyNTUsMjU1LDI1NSkiIGhlaWdodD0iNDAiIHN0cm9rZT0icmdiKDAsMCwwKSIgd2lkdGg9IjEyOCIgeD0iMjU2IiB5PSIzNjAiIC8+CiAgPHRleHQgZmlsbD0icmdiKDAsMCwwKSIgZm9udC1mYW1pbHk9InNhbnMtc2VyaWYiIGZvbnQtc2l6ZT0iOSIgZm9udC1zdHlsZT0ibm9ybWFsIiBmb250LXdlaWdodD0ibm9ybWFsIiB0ZXh0LWFuY2hvcj0ibWlkZGxlIiB0ZXh0TGVuZ3RoPSI1MCIgeD0iMzIwIiB5PSIzODUiPnItbm90ZWJvb2s8L3RleHQ+CiAgPHJlY3QgZmlsbD0icmdiKDI1NSwyNTUsMjU1KSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiB3aWR0aD0iMTI4IiB4PSI0NDgiIHk9IjM2MCIgLz4KICA8dGV4dCBmaWxsPSJyZ2IoMCwwLDApIiBmb250LWZhbWlseT0ic2Fucy1zZXJpZiIgZm9udC1zaXplPSI5IiBmb250LXN0eWxlPSJub3JtYWwiIGZvbnQtd2VpZ2h0PSJub3JtYWwiI
HRleHQtYW5jaG9yPSJtaWRkbGUiIHRleHRMZW5ndGg9IjcwIiB4PSI1MTIiIHk9IjM4NSI+anVsaWEtbm90ZWJvb2s8L3RleHQ+CiAgPHJlY3QgZmlsbD0icmdiKDI1NSwyNTUsMjU1KSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiB3aWR0aD0iMTI4IiB4PSI2NCIgeT0iNDQwIiAvPgogIDx0ZXh0IGZpbGw9InJnYigwLDAsMCkiIGZvbnQtZmFtaWx5PSJzYW5zLXNlcmlmIiBmb250LXNpemU9IjkiIGZvbnQtc3R5bGU9Im5vcm1hbCIgZm9udC13ZWlnaHQ9Im5vcm1hbCIgdGV4dC1hbmNob3I9Im1pZGRsZSIgdGV4dExlbmd0aD0iOTUiIHg9IjEyOCIgeT0iNDU5Ij50ZW5zb3JmbG93LW5vdGVib29rPC90ZXh0PgogIDx0ZXh0IGZpbGw9InJnYigwLDAsMCkiIGZvbnQtZmFtaWx5PSJzYW5zLXNlcmlmIiBmb250LXNpemU9IjkiIGZvbnQtc3R5bGU9Im5vcm1hbCIgZm9udC13ZWlnaHQ9Im5vcm1hbCIgdGV4dC1hbmNob3I9Im1pZGRsZSIgdGV4dExlbmd0aD0iNjUiIHg9IjEyOCIgeT0iNDcwIj4rY3VkYSB2YXJpYW50PC90ZXh0PgogIDxyZWN0IGZpbGw9InJnYigyNTUsMjU1LDI1NSkiIGhlaWdodD0iNDAiIHN0cm9rZT0icmdiKDAsMCwwKSIgd2lkdGg9IjEyOCIgeD0iMjU2IiB5PSI0NDAiIC8+CiAgPHRleHQgZmlsbD0icmdiKDAsMCwwKSIgZm9udC1mYW1pbHk9InNhbnMtc2VyaWYiIGZvbnQtc2l6ZT0iOSIgZm9udC1zdHlsZT0ibm9ybWFsIiBmb250LXdlaWdodD0ibm9ybWFsIiB0ZXh0LWFuY2hvcj0ibWlkZGxlIiB0ZXh0TGVuZ3RoPSI4MCIgeD0iMzIwIiB5PSI0NTkiPnB5dG9yY2gtbm90ZWJvb2s8L3RleHQ+CiAgPHRleHQgZmlsbD0icmdiKDAsMCwwKSIgZm9udC1mYW1pbHk9InNhbnMtc2VyaWYiIGZvbnQtc2l6ZT0iOSIgZm9udC1zdHlsZT0ibm9ybWFsIiBmb250LXdlaWdodD0ibm9ybWFsIiB0ZXh0LWFuY2hvcj0ibWlkZGxlIiB0ZXh0TGVuZ3RoPSIxMTQiIHg9IjMyMCIgeT0iNDcwIj4rY3VkYTExL2N1ZGExMiB2YXJpYW50czwvdGV4dD4KICA8cmVjdCBmaWxsPSJyZ2IoMjU1LDI1NSwyNTUpIiBoZWlnaHQ9IjQwIiBzdHJva2U9InJnYigwLDAsMCkiIHdpZHRoPSIxMjgiIHg9IjQ0OCIgeT0iNDQwIiAvPgogIDx0ZXh0IGZpbGw9InJnYigwLDAsMCkiIGZvbnQtZmFtaWx5PSJzYW5zLXNlcmlmIiBmb250LXNpemU9IjkiIGZvbnQtc3R5bGU9Im5vcm1hbCIgZm9udC13ZWlnaHQ9Im5vcm1hbCIgdGV4dC1hbmNob3I9Im1pZGRsZSIgdGV4dExlbmd0aD0iMTAwIiB4PSI1MTIiIHk9IjQ2NSI+ZGF0YXNjaWVuY2Utbm90ZWJvb2s8L3RleHQ+CiAgPHJlY3QgZmlsbD0icmdiKDI1NSwyNTUsMjU1KSIgaGVpZ2h0PSI0MCIgc3Ryb2tlPSJyZ2IoMCwwLDApIiB3aWR0aD0iMTI4IiB4PSI2NDAiIHk9IjQ0MCIgLz4KICA8dGV4dCBmaWxsPSJyZ2IoMCwwLDApIiBmb250LWZhbWlseT0ic2Fucy1zZXJpZiIgZm9udC1zaXplPSI5IiBmb250LXN0eWxlPSJub3JtYWwiIGZvbnQtd2VpZ2h0PSJub3JtYWwiIHRleHQtYW5jaG9yP
SJtaWRkbGUiIHRleHRMZW5ndGg9IjgwIiB4PSI3MDQiIHk9IjQ2NSI+cHlzcGFyay1ub3RlYm9vazwvdGV4dD4KICA8cmVjdCBmaWxsPSJyZ2IoMjU1LDI1NSwyNTUpIiBoZWlnaHQ9IjQwIiBzdHJva2U9InJnYigwLDAsMCkiIHdpZHRoPSIxMjgiIHg9IjY0MCIgeT0iNTIwIiAvPgogIDx0ZXh0IGZpbGw9InJnYigwLDAsMCkiIGZvbnQtZmFtaWx5PSJzYW5zLXNlcmlmIiBmb250LXNpemU9IjkiIGZvbnQtc3R5bGU9Im5vcm1hbCIgZm9udC13ZWlnaHQ9Im5vcm1hbCIgdGV4dC1hbmNob3I9Im1pZGRsZSIgdGV4dExlbmd0aD0iOTAiIHg9IjcwNCIgeT0iNTQ1Ij5hbGwtc3Bhcmstbm90ZWJvb2s8L3RleHQ+CiAgPHBhdGggZD0iTSAxMjggODAgTCAxMjggMTEyIiBmaWxsPSJub25lIiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBvbHlnb24gZmlsbD0icmdiKDAsMCwwKSIgcG9pbnRzPSIxMjgsMTE5IDEyNCwxMTIgMTMyLDExMiAxMjgsMTE5IiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBhdGggZD0iTSAxMjggMTYwIEwgMTI4IDE5MiIgZmlsbD0ibm9uZSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwb2x5Z29uIGZpbGw9InJnYigwLDAsMCkiIHBvaW50cz0iMTI4LDE5OSAxMjQsMTkyIDEzMiwxOTIgMTI4LDE5OSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwYXRoIGQ9Ik0gMTI4IDI0MCBMIDEyOCAyNzIiIGZpbGw9Im5vbmUiIHN0cm9rZT0icmdiKDAsMCwwKSIgLz4KICA8cG9seWdvbiBmaWxsPSJyZ2IoMCwwLDApIiBwb2ludHM9IjEyOCwyNzkgMTI0LDI3MiAxMzIsMjcyIDEyOCwyNzkiIHN0cm9rZT0icmdiKDAsMCwwKSIgLz4KICA8cGF0aCBkPSJNIDEyOCAzMjAgTCAxMjggMzUyIiBmaWxsPSJub25lIiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBvbHlnb24gZmlsbD0icmdiKDAsMCwwKSIgcG9pbnRzPSIxMjgsMzU5IDEyNCwzNTIgMTMyLDM1MiAxMjgsMzU5IiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBhdGggZD0iTSAxMjggMzIwIEwgMTI4IDM0MCIgZmlsbD0ibm9uZSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwYXRoIGQ9Ik0gMTI4IDM0MCBMIDMyMCAzNDAiIGZpbGw9Im5vbmUiIHN0cm9rZT0icmdiKDAsMCwwKSIgLz4KICA8cGF0aCBkPSJNIDMyMCAzNDAgTCAzMjAgMzUyIiBmaWxsPSJub25lIiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBvbHlnb24gZmlsbD0icmdiKDAsMCwwKSIgcG9pbnRzPSIzMjAsMzU5IDMxNiwzNTIgMzI0LDM1MiAzMjAsMzU5IiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBhdGggZD0iTSAxMjggMzIwIEwgMTI4IDM0MCIgZmlsbD0ibm9uZSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwYXRoIGQ9Ik0gMTI4IDM0MCBMIDUxMiAzNDAiIGZpbGw9Im5vbmUiIHN0cm9rZT0icmdiKDAsMCwwKSIgLz4KICA8cGF0aCBkPSJNIDUxMiAzNDAgTCA1MTIgMzUyIiBmaWxsPSJub25lIiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBvbHlnb24gZmlsbD0icmdiK
DAsMCwwKSIgcG9pbnRzPSI1MTIsMzU5IDUwOCwzNTIgNTE2LDM1MiA1MTIsMzU5IiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBhdGggZD0iTSAxMjggNDAwIEwgMTI4IDQyMCIgZmlsbD0ibm9uZSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwYXRoIGQ9Ik0gMTI4IDQyMCBMIDMyMCA0MjAiIGZpbGw9Im5vbmUiIHN0cm9rZT0icmdiKDAsMCwwKSIgLz4KICA8cGF0aCBkPSJNIDMyMCA0MjAgTCAzMjAgNDMyIiBmaWxsPSJub25lIiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBvbHlnb24gZmlsbD0icmdiKDAsMCwwKSIgcG9pbnRzPSIzMjAsNDM5IDMxNiw0MzIgMzI0LDQzMiAzMjAsNDM5IiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBhdGggZD0iTSAxMjggNDAwIEwgMTI4IDQyMCIgZmlsbD0ibm9uZSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwYXRoIGQ9Ik0gMTI4IDQyMCBMIDcwNCA0MjAiIGZpbGw9Im5vbmUiIHN0cm9rZT0icmdiKDAsMCwwKSIgLz4KICA8cGF0aCBkPSJNIDcwNCA0MjAgTCA3MDQgNDMyIiBmaWxsPSJub25lIiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBvbHlnb24gZmlsbD0icmdiKDAsMCwwKSIgcG9pbnRzPSI3MDQsNDM5IDcwMCw0MzIgNzA4LDQzMiA3MDQsNDM5IiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBhdGggZD0iTSAxMjggNDAwIEwgMTI4IDQzMiIgZmlsbD0ibm9uZSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwb2x5Z29uIGZpbGw9InJnYigwLDAsMCkiIHBvaW50cz0iMTI4LDQzOSAxMjQsNDMyIDEzMiw0MzIgMTI4LDQzOSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwYXRoIGQ9Ik0gMTI4IDQwMCBMIDEyOCA0MjAiIGZpbGw9Im5vbmUiIHN0cm9rZT0icmdiKDAsMCwwKSIgLz4KICA8cGF0aCBkPSJNIDEyOCA0MjAgTCA1MTIgNDIwIiBmaWxsPSJub25lIiBzdHJva2U9InJnYigwLDAsMCkiIC8+CiAgPHBhdGggZD0iTSA1MTIgNDIwIEwgNTEyIDQzMiIgZmlsbD0ibm9uZSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwb2x5Z29uIGZpbGw9InJnYigwLDAsMCkiIHBvaW50cz0iNTEyLDQzOSA1MDgsNDMyIDUxNiw0MzIgNTEyLDQzOSIgc3Ryb2tlPSJyZ2IoMCwwLDApIiAvPgogIDxwYXRoIGQ9Ik0gNzA0IDQ4MCBMIDcwNCA1MTIiIGZpbGw9Im5vbmUiIHN0cm9rZT0icmdiKDAsMCwwKSIgLz4KICA8cG9seWdvbiBmaWxsPSJyZ2IoMCwwLDApIiBwb2ludHM9IjcwNCw1MTkgNzAwLDUxMiA3MDgsNTEyIDcwNCw1MTkiIHN0cm9rZT0icmdiKDAsMCwwKSIgLz4KPC9zdmc+Cg==" width="832" height="600" class="img_e6Vo"></p>
<p>For maximum convenience, we can deliberately pick one of the images built on top of <code>scipy-notebook</code>, namely:</p>
<ul>
<li><code>datascience-notebook</code>: includes <code>scipy-notebook</code> plus support for the R and Julia languages.</li>
<li><code>tensorflow-notebook</code>: includes <code>scipy-notebook</code> plus support for <a href="https://www.tensorflow.org/" target="_blank" rel="noopener noreferrer">TensorFlow</a>.</li>
<li><code>pytorch-notebook</code>: includes <code>scipy-notebook</code> plus support for <a href="https://pytorch.org/" target="_blank" rel="noopener noreferrer">PyTorch</a>.</li>
<li><code>pyspark-notebook</code>: includes <code>scipy-notebook</code> plus support for
<a href="https://spark.apache.org/docs/latest/api/python/index.html" target="_blank" rel="noopener noreferrer">PySpark</a>.</li>
</ul>
<p>Since this article targets general-purpose machine learning work, we will use the <code>datascience-notebook</code> image in the walkthrough.</p>
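<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="docker运行">Running with Docker<a href="https://eason-projects.github.io/eason-blog/blog/ml-env-setup#docker%E8%BF%90%E8%A1%8C" class="hash-link" aria-label="Direct link to Docker运行" title="Direct link to Docker运行">​</a></h2>
<p>This section shows how to run the environment with Docker. Before running anything, let us lay out our requirements:</p>
<ol>
<li>First, we need to map files between the container and the host, that is, mount a host folder into the container.
This way the data we work on is preserved and is not lost when the container is shut down.</li>
<li>We need to reach the Jupyter Lab web interface through a port, so we want port mapping from a port inside the container to a port on the host.</li>
<li>We want to start the container with administrator privileges, so that inside it we can conveniently install software or change configuration as root.</li>
<li>We want the extra Python packages installed in the container to persist after the container is restarted.</li>
</ol>
<p>The following commands start a container that meets the requirements above:</p>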
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain"># Create volume before using it</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker volume create jupyter-data</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># Run docker container</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker run -d \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    --name jupyter-ds \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -p 28888:8888 \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -v "$HOME/_jupyter_mount":/home/jovyan/work \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -v "$(pwd)":/home/jovyan/pwd \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -v jupyter-data:/home/jovyan/.local \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    --user root \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -e GRANT_SUDO=yes \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -e CHOWN_HOME=yes \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -e CHOWN_HOME_OPTS='-R' \</span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">    jupyter/datascience-notebook</span><br></span></code></pre></div></div>
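<p>Let us go through the parameters of this command:</p>
<ul>
<li><code>-d</code>: runs the container in the background (detached mode)</li>
<li><code>--name jupyter-ds</code>: gives the container a name, which makes it easier to manage later</li>
<li><code>-p 28888:8888</code>: maps port 8888 inside the container to port 28888 on the host</li>
<li><code>-v "$HOME/_jupyter_mount":/home/jovyan/work</code>: mounts the <code>_jupyter_mount</code> directory in the user's home directory to the work directory inside the container</li>
<li><code>-v "$(pwd)":/home/jovyan/pwd</code>: mounts the current directory to the <code>pwd</code> directory inside the container</li>
<li><code>-v jupyter-data:/home/jovyan/.local</code>: mounts the named volume <code>jupyter-data</code> created beforehand, to persist user-installed packages</li>
<li><code>--user root</code>: runs the container as the root user</li>
<li><code>-e GRANT_SUDO=yes</code>: allows the use of the sudo command</li>
<li><code>-e CHOWN_HOME=yes</code>: automatically changes the ownership of the home directory when the container starts</li>
<li><code>-e CHOWN_HOME_OPTS='-R'</code>: changes the ownership of all subdirectories and files recursively, so that permissions are set consistently</li>
<li><code>jupyter/datascience-notebook</code>: the name of the image to use</li>
</ul>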
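<p>If the container is started from a script rather than typed by hand, the same options can be assembled programmatically. The sketch below is one possible way to do that in Python; the helper name <code>build_run_command</code> and its parameters are hypothetical, and it only builds and prints the argument list described above without invoking Docker.</p>

```python
import os
import shlex
from pathlib import Path
from typing import List, Optional


def build_run_command(
    name: str = "jupyter-ds",
    host_port: int = 28888,
    mount_dir: Optional[str] = None,
    volume: str = "jupyter-data",
    image: str = "jupyter/datascience-notebook",
) -> List[str]:
    """Assemble the `docker run` argument list described above (hypothetical helper)."""
    if mount_dir is None:
        mount_dir = str(Path.home() / "_jupyter_mount")
    return [
        "docker", "run", "-d",
        "--name", name,
        "-p", f"{host_port}:8888",                  # host port : container port
        "-v", f"{mount_dir}:/home/jovyan/work",     # persistent work directory
        "-v", f"{os.getcwd()}:/home/jovyan/pwd",    # current directory
        "-v", f"{volume}:/home/jovyan/.local",      # named volume for user packages
        "--user", "root",
        "-e", "GRANT_SUDO=yes",
        "-e", "CHOWN_HOME=yes",
        "-e", "CHOWN_HOME_OPTS=-R",
        image,
    ]


if __name__ == "__main__":
    # Print the command for inspection; pass the list to subprocess.run to execute it.
    print(shlex.join(build_run_command()))
```

<p>Keeping the command as a list rather than a single string avoids shell-quoting problems when a path contains spaces.</p>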
<p>After the container starts, the following command shows the Jupyter Lab access URL (including the access token):</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">docker logs jupyter-ds</span><br></span></code></pre></div></div>
<p>We can now open the local address <a href="http://localhost:28888/" target="_blank" rel="noopener noreferrer">http://localhost:28888</a> to access our Jupyter Lab service.
After logging in, we see the following interface:</p>
<p><img decoding="async" loading="lazy" alt="Jupyter Lab Landing Page" src="https://eason-projects.github.io/eason-blog/assets/images/jupyter-lab-landing-page-3d9a3d9ce0ba7778e1f897bcd09f4d2c.png" width="1461" height="1000" class="img_e6Vo"></p>
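<p>The URL printed in the logs uses the container's own port (8888), while the host exposes port 28888. The sketch below shows one way to pull the token out of the log text and rebuild the URL for the host port. The function names <code>find_login_url</code> and <code>read_container_logs</code> are hypothetical, and the assumption that the log contains a <code>token=</code> query parameter with a hexadecimal value reflects typical Jupyter output rather than a guaranteed format.</p>

```python
import re
import subprocess
from typing import Optional

# Assumption: the server log contains a URL with a "?token=<hex string>" parameter.
TOKEN_PATTERN = re.compile(r"token=([0-9a-f]+)")


def find_login_url(log_text: str, host_port: int = 28888) -> Optional[str]:
    """Return a login URL for the mapped host port, or None if no token is found."""
    match = TOKEN_PATTERN.search(log_text)
    if match is None:
        return None
    return f"http://localhost:{host_port}/lab?token={match.group(1)}"


def read_container_logs(name: str = "jupyter-ds") -> str:
    """Collect `docker logs` output; stdout and stderr are both read."""
    result = subprocess.run(
        ["docker", "logs", name], capture_output=True, text=True, check=True
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    print(find_login_url(read_container_logs()))
```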
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="包管理">Package Management<a href="https://eason-projects.github.io/eason-blog/blog/ml-env-setup#%E5%8C%85%E7%AE%A1%E7%90%86" class="hash-link" aria-label="Direct link to 包管理" title="Direct link to 包管理">​</a></h2>
<p>While using the environment, we may need to install additional Python packages. Because the <code>.local</code> directory is stored on a data volume,
the installed packages are persisted. There are several ways to install packages:</p>
<ol>
<li>
<p>Install from the Jupyter Lab terminal:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">pip install --user package_name</span><br></span></code></pre></div></div>
</li>
<li>
<p>Install from outside the container:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">docker exec -it jupyter-ds pip install --user package_name</span><br></span></code></pre></div></div>
</li>
<li>
<p>Install in bulk from a <code>requirements.txt</code> file:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain"># First, create the requirements.txt file</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">echo "pandas==2.0.0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">scikit-learn==1.2.0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">matplotlib==3.7.0" &gt; requirements.txt</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># Copy the file into the container</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker cp requirements.txt jupyter-ds:/tmp/</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># Install inside the container</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker exec -it jupyter-ds pip install --user -r /tmp/requirements.txt</span><br></span></code></pre></div></div>
</li>
</ol>
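<p>Note: the <code>--user</code> flag installs packages into the user's home directory, which avoids permission problems.
Because we mounted a data volume, these packages are also persisted across container restarts.</p>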
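<p>To confirm where <code>--user</code> packages actually land, a short check can be run in a notebook cell. This is a minimal sketch that uses only the standard library; whether the printed directory sits inside your mounted volume depends on how the volume was mounted, which is an assumption to verify in your own setup.</p>

```python
import site
import sys

# Directory that `pip install --user` targets for the current interpreter.
user_site = site.getusersitepackages()
print("user site-packages:", user_site)

# True when that directory is already on this interpreter's import path.
print("on sys.path:", user_site in sys.path)
```

<p>If the printed path is not under the mounted directory, packages installed with <code>--user</code> would be lost when the container is removed.</p>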
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="容器管理">Container Management<a href="https://eason-projects.github.io/eason-blog/blog/ml-env-setup#%E5%AE%B9%E5%99%A8%E7%AE%A1%E7%90%86" class="hash-link" aria-label="Direct link to Container Management" title="Direct link to Container Management">​</a></h2>
<p>In day-to-day use you will occasionally need to manage the container. Here are some commonly used commands:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain"># Stop the container</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker stop jupyter-ds</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># Start an existing container</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker start jupyter-ds</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># Restart the container</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker restart jupyter-ds</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># Remove the container (it must be stopped first)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker rm jupyter-ds</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># View the container logs</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker logs jupyter-ds</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"># Open a shell inside the container</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">docker exec -it jupyter-ds bash</span><br></span></code></pre></div></div>
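<p>These commands help you manage your Jupyter environment. In particular, when you need to install new packages or debug an environment problem,
opening a shell inside the container with <code>docker exec</code> is very useful.</p>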
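<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="总结">Summary<a href="https://eason-projects.github.io/eason-blog/blog/ml-env-setup#%E6%80%BB%E7%BB%93" class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary">​</a></h2>
<p>In this article we introduced the different Jupyter Docker image variants
and showed how to use Docker to quickly set up a machine learning development environment.</p>
<p>Keep in mind that the images bundle different libraries, and versions may be incompatible between them.
For example, if you need TensorFlow-specific features, you still have to use the corresponding image.</p>
<p>There are a few security considerations when using this environment:</p>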
<ol>
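<li>Although we started the container with root privileges, avoid working as root in day-to-day use.</li>
<li>Set your own Jupyter login token instead of relying on the default, for example through the <code>JUPYTER_TOKEN</code> environment variable.</li>
<li>For production use, configure HTTPS and restrict which IP addresses may connect.</li>
<li>Update the Docker image regularly to pick up security patches.</li>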
</ol>]]></content:encoded>
            <category>Machine Learning</category>
        </item>
        <item>
            <title><![CDATA[Training fastText with MLflow and Ray]]></title>
            <link>https://eason-projects.github.io/eason-blog/blog/fasttext</link>
            <guid>https://eason-projects.github.io/eason-blog/blog/fasttext</guid>
            <pubDate>Sun, 16 Feb 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[fastText is an NLP library developed by Facebook.]]></description>
            <content:encoded><![CDATA[<p>fastText is an NLP library developed by Facebook.</p>
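<p>It provides two main features: text classification and word-vector learning.
For classification, the core idea is to average the word vectors of a whole sentence into a single text representation
and feed that representation to a softmax classifier.</p>
<p>This article walks through training a fastText model with MLflow and Ray.</p>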
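<p>The averaging-plus-softmax idea described above can be sketched in a few lines of plain Python. This is an illustration only, not fastText's implementation: the embeddings, weights and helper names below are hypothetical, hand-picked values rather than learned parameters, and the real library adds further techniques such as n-gram features.</p>

```python
import math


def average_vectors(tokens, embeddings, dim):
    """Average the embedding of every known token into one text vector."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(tokens, embeddings, weights, labels, dim):
    """Score each label with a linear layer on the averaged vector, then apply softmax."""
    text_vec = average_vectors(tokens, embeddings, dim)
    scores = [sum(w * x for w, x in zip(row, text_vec)) for row in weights]
    return dict(zip(labels, softmax(scores)))


# Toy, hand-picked numbers (hypothetical, not learned from data).
EMBEDDINGS = {"发货": [1.0, 0.0], "亲": [0.0, 1.0], "您好": [0.0, 1.0]}
WEIGHTS = [[2.0, 0.0], [0.0, 2.0]]  # one row of weights per label
LABELS = ["customer", "seller"]

probs = classify(["您好", "亲"], EMBEDDINGS, WEIGHTS, LABELS, dim=2)
print(probs)
```

<p>With these toy values the averaged vector points entirely along the second dimension, so the <code>seller</code> label receives the larger probability.</p>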
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="fasttext介绍">About fastText<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#fasttext%E4%BB%8B%E7%BB%8D" class="hash-link" aria-label="Direct link to About fastText" title="Direct link to About fastText">​</a></h2>
<p>As an efficient tool for text classification and word-vector representation, fastText has the following characteristics:</p>
<ol>
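<li><strong>Fast training</strong>: the fastText authors report training on more than a billion words in under ten minutes on a standard multi-core CPU</li>
<li><strong>Strong results</strong>: accuracy on text classification tasks is often on par with deep learning models</li>
<li><strong>Low resource usage</strong>: compared with deep learning models, fastText has modest hardware requirements and produces small model files</li>
<li><strong>Multilingual support</strong>: pre-trained word vectors have been published for 294 languages</li>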
</ol>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="训练目标">Training Goal<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E8%AE%AD%E7%BB%83%E7%9B%AE%E6%A0%87" class="hash-link" aria-label="Direct link to Training Goal" title="Direct link to Training Goal">​</a></h2>
<p>In this article we work with the <a href="https://github.com/cooelf/DeepUtteranceAggregation/" target="_blank" rel="noopener noreferrer">Taobao customer service dialogue dataset</a>.
The goal is to train a classifier
that decides, for any dialogue message, whether it was written by the customer or by the customer service agent.</p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="数据准备">Data Preparation<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E6%95%B0%E6%8D%AE%E5%87%86%E5%A4%87" class="hash-link" aria-label="Direct link to Data Preparation" title="Direct link to Data Preparation">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="数据文件格式">Data File Format<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E6%95%B0%E6%8D%AE%E6%96%87%E4%BB%B6%E6%A0%BC%E5%BC%8F" class="hash-link" aria-label="Direct link to Data File Format" title="Direct link to Data File Format">​</a></h3>
<p>After downloading from the link above, you will find three data files: <code>train.txt</code>, <code>dev.txt</code> and <code>test.txt</code>.</p>
<p>Open any of them and the contents look like this:</p>
<div class="language-plaintext codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-plaintext codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">1 在 吗 您好 现在 拍 几天 能 到 辽宁 这个 不 一定 哦 大概 几天 不 知道 么 一般 情况 下 3 到 5 天 左右</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">0 在 吗 您好 现在 拍 几天 能 到 辽宁 这个 不 一定 哦 大概 几天 不 知道 么 亲 不会 的 呢 您 放心</span><br></span></code></pre></div></div>
<p>Each line consists of:</p>
<ul>
<li><code>1</code> or <code>0</code>: marks a correct or an incorrect dialogue flow.</li>
<li>Alternating turns, separated by <code>\t</code>:
<ul>
<li>Customer question</li>
<li>Agent reply</li>
</ul>
</li>
</ul>
<p>For example, the first line of the sample above breaks down as follows:</p>
<div class="codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-text codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">1 --&gt; correct dialogue</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">在 吗 --&gt; customer question</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">您好 --&gt; agent reply</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">现在 拍 几天 能 到 辽宁 这个 不 一定 哦 大概 几天 不 知道 么 --&gt; customer question</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">一般 情况 下 3 到 5 天 左右 --&gt; agent reply</span><br></span></code></pre></div></div>
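<p>The line layout described above can be split programmatically. The sketch below assumes the fields really are tab-separated with the label first and the turns alternating customer, agent; <code>parse_dialogue_line</code> is a hypothetical helper name, and the sample string is a shortened version of the first sample line with the tabs written out explicitly.</p>

```python
def parse_dialogue_line(line):
    """Split one raw line into (label, customer_turns, seller_turns)."""
    fields = line.rstrip("\n").split("\t")
    label, turns = fields[0], fields[1:]
    customer_turns = turns[0::2]  # 1st, 3rd, 5th ... turn
    seller_turns = turns[1::2]    # 2nd, 4th, 6th ... turn
    return label, customer_turns, seller_turns


# Shortened sample with explicit tabs (an assumption about the raw file layout).
sample = "1\t在 吗\t您好\t现在 拍 几天 能 到 辽宁\t一般 情况 下 3 到 5 天 左右"
label, customer_turns, seller_turns = parse_dialogue_line(sample)
print(label)
print(customer_turns)
print(seller_turns)
```

<p>Slicing with a step of two keeps the customer and agent turns apart without tracking an index by hand.</p>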
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="fasttext要求的数据格式">Data Format Required by fastText<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#fasttext%E8%A6%81%E6%B1%82%E7%9A%84%E6%95%B0%E6%8D%AE%E6%A0%BC%E5%BC%8F" class="hash-link" aria-label="Direct link to Data Format Required by fastText" title="Direct link to Data Format Required by fastText">​</a></h3>
<p>fastText has its own input format: a plain text file in which each line looks like this:</p>
<div class="codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-text codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">label1 label2 label... text</span><br></span></code></pre></div></div>
<p>That is, each line can carry several labels, and the labels and the text are separated by spaces.</p>
<p>Labels use a specific notation. Suppose we want two classes:</p>
<ul>
<li><code>seller</code>: the customer service agent</li>
<li><code>customer</code>: the customer</li>
</ul>
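<p>In fastText these are written as <code>__label__customer</code> and <code>__label__seller</code>.</p>
<p>So we need to parse each line of the raw data files and write the result to a file in the following format, for example:</p>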
<div class="codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-text codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 您好咱们这边如果能提升您的销量利润您会考虑跟我们合作吗</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 亲你家什么时候还有活动啊</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 我的订单怎么两天了没什么变化啊在么</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 好麻烦你改下谢谢</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 明天可以发货</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__seller 有的哦满68送95g猪肉脯一袋</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 这个买两个有优惠吗</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 亲亲已下单买这么多请掌柜的多送点小礼物啊</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 嗯嗯了解了哦</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 我买10个付款的时候怎么不打折呢</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__seller 您这边提交订单看看哦系统自动改价的和芒果干一样的</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__seller 不好意思亲可能快递途中挤压造成的这边退亲2元差价亲看可以吗</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">__label__customer 亲有原味瓜子么</span><br></span></code></pre></div></div>
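<p>As a small illustration of the format above, the sketch below builds and reads back a single fastText line. The helper names <code>to_fasttext_line</code> and <code>split_fasttext_line</code> are hypothetical and not part of the fastText library; only the <code>__label__</code> prefix comes from the format itself.</p>

```python
LABEL_PREFIX = "__label__"


def to_fasttext_line(labels, text):
    """Build one fastText line: prefixed labels first, then the text."""
    prefixed = [LABEL_PREFIX + label for label in labels]
    # One example per line, so collapse any tabs or line breaks in the text.
    clean_text = " ".join(text.split())
    return " ".join(prefixed + [clean_text])


def split_fasttext_line(line):
    """Separate the labels from the text of a fastText line."""
    tokens = line.split()
    labels = [t[len(LABEL_PREFIX):] for t in tokens if t.startswith(LABEL_PREFIX)]
    words = [t for t in tokens if not t.startswith(LABEL_PREFIX)]
    return labels, " ".join(words)


line = to_fasttext_line(["customer"], "亲有原味瓜子么")
print(line)
print(split_fasttext_line(line))
```

<p>Passing several labels, for example <code>["customer", "seller"]</code>, yields a line that starts with both prefixed labels, which matches the multi-label layout described earlier.</p>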
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="处理代码">Processing Code<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E5%A4%84%E7%90%86%E4%BB%A3%E7%A0%81" class="hash-link" aria-label="Direct link to Processing Code" title="Direct link to Processing Code">​</a></h3>
<p>We can write the following two functions to read a raw data file, split it line by line into customer and agent utterances as described above,
and save the result in fastText format as a training or validation file.</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> re</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> tqdm </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> tqdm</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">load_data</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">with</span><span class="token plain"> </span><span class="token builtin">open</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'r'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> encoding</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'utf-8'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> f</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        data </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> f</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">readlines</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    customer_utterances </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    seller_utterances </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> idx</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> line </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> tqdm</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">enumerate</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        utterances </span><span class="token operator" style="color:#393A34">=</span><span class="token 
plain"> line</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">strip</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">split</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'\t'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        utterances </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">re</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">sub</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">r'[\s\n\t]'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">''</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> utterance</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> utterance </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> utterances</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" 
style="color:#00009f">if</span><span class="token plain"> utterances</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'1'</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            customer_utterances </span><span class="token operator" style="color:#393A34">+=</span><span class="token plain"> utterances</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">:</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            seller_utterances </span><span class="token operator" style="color:#393A34">+=</span><span class="token plain"> utterances</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">:</span><span class="token punctuation" style="color:#393A34">:</span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    
</span><span class="token comment" style="color:#999988;font-style:italic"># Remove utterances only contain digits. If it contains digits and other characters, keep it.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    customer_utterances </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">utterance </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> utterance </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> customer_utterances </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> re</span><span class="token punctuation" style="color:#393A34">.</span><span class="token keyword" style="color:#00009f">match</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">r'^[0-9]+$'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> utterance</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    seller_utterances </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">utterance </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> utterance </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> seller_utterances </span><span class="token keyword" 
style="color:#00009f">if</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> re</span><span class="token punctuation" style="color:#393A34">.</span><span class="token keyword" style="color:#00009f">match</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">r'^[0-9]+$'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> utterance</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Only keep utterances with more than 5 characters</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    customer_utterances </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">utterance </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> utterance </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> customer_utterances </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">utterance</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> 
</span><span class="token number" style="color:#36acaa">5</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    seller_utterances </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">utterance </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> utterance </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> seller_utterances </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token builtin">len</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">utterance</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">5</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># If the utterance are the same in both customer and seller, remove duplicates using sets</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    customer_set </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">set</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
plain">customer_utterances</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    seller_set </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">set</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">seller_utterances</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Remove duplicates that appear in both sets</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    unique_customer </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">list</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">customer_set </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> seller_set</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    unique_seller </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token builtin">list</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">seller_set </span><span class="token operator" style="color:#393A34">-</span><span class="token plain"> customer_set</span><span class="token punctuation" style="color:#393A34">)</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> unique_customer</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> unique_seller</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">generate_fasttext_data</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">customer_utterances</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> seller_utterances</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> output_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token triple-quoted-string string" style="color:#e3116c">"""Generate FastText training data with labels."""</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">with</span><span class="token plain"> </span><span class="token builtin">open</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">output_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span 
class="token string" style="color:#e3116c">'w'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> encoding</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'utf-8'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> f</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Write customer utterances</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> utterance </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> customer_utterances</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            f</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">write</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"__label__customer </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">utterance</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">\n"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Write seller utterances</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> utterance </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> seller_utterances</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            f</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">write</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"__label__seller </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">utterance</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">\n"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"FastText training data saved to </span><span class="token 
string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">output_path</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"Total samples: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation builtin">len</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">(</span><span class="token string-interpolation interpolation">customer_utterances</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">)</span><span class="token string-interpolation interpolation"> </span><span class="token string-interpolation interpolation operator" style="color:#393A34">+</span><span class="token string-interpolation interpolation"> </span><span class="token string-interpolation interpolation builtin">len</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">(</span><span class="token string-interpolation interpolation">seller_utterances</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">)</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"Customer samples: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation builtin">len</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">(</span><span class="token string-interpolation interpolation">customer_utterances</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">)</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"Seller samples: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation builtin">len</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">(</span><span class="token string-interpolation interpolation">seller_utterances</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">)</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span 
class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>After processing the files, we have 238,275 training samples in total: 138,429 customer utterances and 99,846 seller (customer-service) utterances.</p>
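<p>As a quick sanity check on these numbers, the per-label counts can be recomputed directly from the generated file. The sketch below is a minimal, standard-library-only example; the helper name <code>count_labels</code> and the demo file contents are illustrative assumptions, not part of the original pipeline.</p>

```python
import os
import tempfile
from collections import Counter

LABEL_PREFIX = "__label__"


def count_labels(path):
    """Count samples per label in a fastText-format file.

    Each non-empty line is expected to look like '__label__<name> <text>'.
    """
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            # The label is the first space-separated token on the line.
            first_token = line.split(" ", 1)[0].strip()
            if first_token.startswith(LABEL_PREFIX):
                counts[first_token[len(LABEL_PREFIX):]] += 1
    return counts


if __name__ == "__main__":
    # Tiny demo file (made-up utterances) in the same format as the training data.
    with tempfile.TemporaryDirectory() as tmp_dir:
        demo_path = os.path.join(tmp_dir, "demo.txt")
        with open(demo_path, "w", encoding="utf-8") as f:
            f.write("__label__customer where is my order\n")
            f.write("__label__seller let me check that for you\n")
            f.write("__label__customer thanks\n")
        counts = count_labels(demo_path)
        print(dict(counts), "total:", sum(counts.values()))
```

<p>Note that if an utterance itself contains a line break, it is written as several lines and only the first one carries a label, so the counts from the file can differ from the list lengths printed by <code>generate_fasttext_data</code>.</p>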
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="训练">Training<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E8%AE%AD%E7%BB%83" class="hash-link" aria-label="Direct link to 训练" title="Direct link to 训练">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="fasttext训练">Training with fastText<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#fasttext%E8%AE%AD%E7%BB%83" class="hash-link" aria-label="Direct link to fastText训练" title="Direct link to fastText训练">​</a></h3>
<p>We can train the model with the <code>fasttext.train_supervised()</code> function. For example, a training call can look like this:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> fasttext</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">train_supervised</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token builtin">input</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">input_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    epoch</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">100</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    lr</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">0.1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    wordNgrams</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> 
</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    verbose</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">2</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">save_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">output_path</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Here <code>input_path</code> is the data file we prepared above. For the remaining parameters, we can let Ray Tune search for good values.</p>
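<p>Comparing parameter settings fairly requires data the model was not trained on. The sketch below shows one way to split the labeled file into a training file and a held-out test file using only the standard library. The function names, the 10% test ratio, and the fixed seed are assumptions made for illustration; the original post does not show how its test file was produced.</p>

```python
import random


def split_labeled_lines(lines, test_ratio=0.1, seed=42):
    """Shuffle labeled lines and split them into (train, test) lists."""
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be strictly between 0 and 1")
    shuffled = list(lines)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def split_fasttext_file(input_path, train_path, test_path, test_ratio=0.1, seed=42):
    """Split a fastText-format file into train and test files; return their sizes."""
    with open(input_path, encoding="utf-8") as f:
        # Drop blank lines and make sure every kept line ends with a newline.
        lines = [line.rstrip("\n") + "\n" for line in f if line.strip()]
    train, test = split_labeled_lines(lines, test_ratio=test_ratio, seed=seed)
    with open(train_path, "w", encoding="utf-8") as f:
        f.writelines(train)
    with open(test_path, "w", encoding="utf-8") as f:
        f.writelines(test)
    return len(train), len(test)
```

<p>With a split like this, the training file would be passed as <code>input_path</code> and the held-out file as the test file used for evaluation.</p>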
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="ray-tune寻找参数">Searching for Parameters with Ray Tune<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#ray-tune%E5%AF%BB%E6%89%BE%E5%8F%82%E6%95%B0" class="hash-link" aria-label="Direct link to Ray Tune寻找参数" title="Direct link to Ray Tune寻找参数">​</a></h3>
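<p>To use Ray Tune, we need to define a training step that receives the hyperparameter configuration, along with the training file and the test file we specify.
The code is as follows:</p>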
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">get_file_size_mb</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">file_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token triple-quoted-string string" style="color:#e3116c">"""</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">    Get file size in megabytes.</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">    Args:</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">        file_path (str): Path to the file</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">        </span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">    Returns:</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string 
string" style="color:#e3116c">        float: File size in MB</span><br></span><span class="token-line" style="color:#393A34"><span class="token triple-quoted-string string" style="color:#e3116c">    """</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    size_bytes </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> os</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">path</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">getsize</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">file_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    size_mb </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> size_bytes </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">1024</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1024</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> </span><span class="token builtin">round</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">size_mb</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token 
number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">train_and_evaluate_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">config</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> input_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> test_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_tracking_uri</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">MLFLOW_TRACKING_URI</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_experiment</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"fasttext-demo-v3"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">    context </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> ray</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">train</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_context</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    trial_id </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> context</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_trial_id</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">with</span><span class="token plain"> mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">start_run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">run_name</span><span class="token operator" style="color:#393A34">=</span><span class="token string-interpolation string" style="color:#e3116c">f'trial_</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">trial_id</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" 
style="color:#e3116c">'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> run</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        model_path </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string-interpolation string" style="color:#e3116c">f'model_trail_</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">trial_id</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">.bin'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        training_df </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> convert_file_into_dataframe</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">input_path</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">input_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        dataset </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">data</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token plain">from_pandas</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            training_df</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'training data'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> targets</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'label'</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_input</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">dataset</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">dataset</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> context</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'training'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span 
class="token plain">log_params</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">config</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Use the dictionary's attributes in the function call</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> fasttext</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">train_supervised</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">**</span><span class="token plain">config</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">save_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">model_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        model_size_mb </span><span class="token operator" 
style="color:#393A34">=</span><span class="token plain"> get_file_size_mb</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">model_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        num_of_samples</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> precision</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> recall </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">test</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">test_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        f1_score </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> precision </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> recall </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">precision </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> recall</span><span class="token punctuation" style="color:#393A34">)</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metrics</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"f1_score"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> f1_score</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"precision"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> precision</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"recall"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> recall</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"model_size"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_output_matrix</span><span 
class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">size</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"model_size_mb"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model_size_mb</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">report</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"f1_score"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> f1_score</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"precision"</span><span class="token punctuation" 
style="color:#393A34">:</span><span class="token plain"> precision</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"recall"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> recall</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"model_size"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_output_matrix</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">size</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"model_size_mb"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model_size_mb</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"model_path"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
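<p>The function <code>train_and_evaluate_model</code> is the core function we use to train and evaluate the fastText model; it combines MLflow tracking with Ray Tune. A step-by-step breakdown follows:</p>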
<ol>
<li>
<p><strong>Function parameters</strong>:</p>
<ul>
<li><code>config</code>: a dictionary of model training parameters</li>
<li><code>input_path</code>: path to the training data file</li>
<li><code>test_path</code>: path to the test data file</li>
</ul>
</li>
<li>
<p><strong>MLflow setup</strong>:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_tracking_uri</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">MLFLOW_TRACKING_URI</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_experiment</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"fasttext-demo-v3"</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
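<p>Sets the MLflow tracking server URI and selects the experiment named <code>fasttext-demo-v3</code>; <code>mlflow.set_experiment</code> creates the experiment if it does not exist yet.</p>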
</li>
<li>
<p><strong>Ray Tune context</strong>:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">context </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> ray</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">train</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_context</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">trial_id </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> context</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_trial_id</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
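<p>Gets the Ray Tune training context and the trial ID, which distinguishes one training trial from another. The ID is an 8-character hexadecimal string, for example <code>01b8b86e</code>.</p>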
</li>
<li>
<p><strong>Data preparation</strong>:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">training_df </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> convert_file_into_dataframe</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">input_path</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">input_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">dataset </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">data</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">from_pandas</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">training_df</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'traing data'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> targets</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'label'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_input</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">dataset</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">dataset</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> context</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'training'</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
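<p>Converts the training data into a DataFrame and creates an MLflow dataset object from it.
This step is optional; it is mainly here to demonstrate how <code>mlflow.log_input</code> records the training data used by a run.</p>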
</li>
<li>
<p><strong>Model training</strong>:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> fasttext</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">train_supervised</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">**</span><span class="token plain">config</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">save_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">model_path</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Trains the fastText model with the configuration parameters and saves the model file.</p>
</li>
<li>
<p><strong>Model evaluation</strong>:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">model_size_mb </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> get_file_size_mb</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">model_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">num_of_samples</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> precision</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> recall </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">test</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">test_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">f1_score </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> precision </span><span class="token operator" 
style="color:#393A34">*</span><span class="token plain"> recall </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">precision </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> recall</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
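<p>Evaluates the model on the test data and computes precision, recall, and the F1 score.
We also measure the size of the generated model file with <code>get_file_size_mb</code>,
because the model should meet the accuracy requirements while keeping its computational cost low.</p>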
</li>
<li>
<p><strong>Metric logging</strong>:</p>
<ul>
<li>
<p>Log the parameters and metrics with MLflow:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_params</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">config</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metrics</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
</li>
<li>
<p>Report the results to Ray Tune:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">report</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
</li>
</ul>
<p>Records the training parameters, the evaluation metrics, the model size, and related information.</p>
</li>
</ol>
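<p>The evaluation step can be sketched with the standard library alone. This is a minimal sketch, not necessarily the author's code: the body of <code>get_file_size_mb</code> is one plausible implementation (the divisor of 1024 * 1024 bytes per MB is an assumption), and <code>safe_f1</code> is a hypothetical helper that guards the F1 formula against a zero denominator, which occurs when both precision and recall are 0.</p>

```python
import os


def get_file_size_mb(path):
    """Return the size of the file at `path` in MB.

    Assumption: 1 MB is taken as 1024 * 1024 bytes.
    """
    return os.path.getsize(path) / (1024 * 1024)


def safe_f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero.

    Hypothetical helper: same formula as in the walkthrough above,
    plus a guard against ZeroDivisionError.
    """
    total = precision + recall
    if total == 0:
        return 0.0
    return 2 * precision * recall / total
```

<p>With the guard in place, a trial whose model predicts nothing correctly reports an F1 score of 0 instead of failing the trial with an exception.</p>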
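<p>The function <code>train_and_evaluate_model</code> tracks the process and result of every training run with MLflow, while Ray Tune drives the hyperparameter optimization. Each run:</p>
<ul>
<li>logs the training parameters</li>
<li>saves information about the training data</li>
<li>trains the model</li>
<li>evaluates the model's performance</li>
<li>logs the metrics</li>
</ul>
<p>We can then inspect the details of every run in the MLflow UI and use Ray Tune to find the best model parameter configuration.</p>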
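<p>Because each trial reports both <code>f1_score</code> (higher is better) and <code>model_size_mb</code> (lower is better), "best" is a trade-off rather than a single number. One common way to read such results is the Pareto front: the trials that no other trial beats on both metrics at once. The sketch below is only an illustration of that idea. The function <code>pareto_front</code>, the <code>trial_id</code> values, and the sample numbers are made up for the example, and this is not a description of how Ray Tune selects a trial internally.</p>

```python
def pareto_front(results):
    """Keep the results that no other result dominates.

    Each result is a dict with "f1_score" (higher is better) and
    "model_size_mb" (lower is better), mirroring the reported metric names.
    """
    def dominates(a, b):
        # a is at least as good as b on both metrics and strictly better on one.
        no_worse = (a["f1_score"] >= b["f1_score"]
                    and b["model_size_mb"] >= a["model_size_mb"])
        better = (a["f1_score"] > b["f1_score"]
                  or b["model_size_mb"] > a["model_size_mb"])
        return no_worse and better

    return [r for r in results if not any(dominates(o, r) for o in results)]


# Illustrative values only, not measured results.
results = [
    {"trial_id": "a", "f1_score": 0.90, "model_size_mb": 12.0},
    {"trial_id": "b", "f1_score": 0.88, "model_size_mb": 4.0},
    {"trial_id": "c", "f1_score": 0.85, "model_size_mb": 6.0},
    {"trial_id": "d", "f1_score": 0.90, "model_size_mb": 15.0},
]
front = pareto_front(results)
print([r["trial_id"] for r in front])
```

<p>In this made-up data, trial <code>c</code> is dropped because <code>b</code> is both more accurate and smaller, and <code>d</code> is dropped because <code>a</code> matches its F1 score with a smaller file. Choosing between the remaining trials is a business decision about how much accuracy is worth per megabyte.</p>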
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>Training file paths</div><div class="admonitionContent_y0EO"><p>When training with Ray Tune, keep in mind that Ray is a distributed execution engine,
so it may run the training code on different hosts.</p><p>The training data therefore has to be accessible from every host.</p><p>In this article we run Ray locally on a single machine, so we pass the training files as absolute paths, for example:
<code>/ray-tune/data/fasttext_train.txt</code>.</p></div></div>
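<p>For the single-machine setup described in the tip above, a small check before launching the trials can catch a wrong path early. This is a minimal sketch: <code>resolve_data_file</code> is a hypothetical helper, not part of Ray or of the original code. It only verifies the file on the machine where it runs, so on a multi-node cluster it does not prove that the workers can read the same path.</p>

```python
from pathlib import Path


def resolve_data_file(path):
    """Return the absolute path of an existing data file, or raise early.

    Hypothetical helper: Ray trials may run with a different working
    directory, so relative paths are resolved before being handed to them.
    """
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"data file not found: {resolved}")
    return str(resolved)
```

<p>The returned string can then be passed as <code>input_path</code> or <code>test_path</code>, so every trial receives the same absolute path regardless of its working directory.</p>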
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="定义参数寻找范围">Defining the parameter search space<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E5%AE%9A%E4%B9%89%E5%8F%82%E6%95%B0%E5%AF%BB%E6%89%BE%E8%8C%83%E5%9B%B4" class="hash-link" aria-label="Direct link to 定义参数寻找范围" title="Direct link to 定义参数寻找范围">​</a></h3>
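<p>With the training function above in place, we can define the parameter search space and the optimization strategy in Ray Tune and start training the model.</p>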
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">tune_fasttext_parameters</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">input_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> test_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> num_samples</span><span class="token operator" style="color:#393A34">=</span><span class="token number" style="color:#36acaa">50</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    ray</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">init</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    config </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      
  </span><span class="token string" style="color:#e3116c">"input"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> input_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"epoch"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randint</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">50</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">300</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"lr"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">loguniform</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">1e-5</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1e-3</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic"># Reduced max learning rate</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"wordNgrams"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randint</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">4</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"dim"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randint</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">50</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"ws"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randint</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token number" style="color:#36acaa">3</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">7</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"minCount"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randint</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"minn"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"maxn"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3</span><span class="token punctuation" style="color:#393A34">,</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"bucket"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">choice</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">50000</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">100000</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">200000</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"thread"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">4</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"loss"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"softmax"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">        </span><span class="token string" style="color:#e3116c">"verbose"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    search_algo </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> OptunaSearch</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        metric</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">"f1_score"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"model_size_mb"</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mode</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">"max"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" 
style="color:#e3116c">"min"</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    analysis </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">lambda</span><span class="token plain"> trail_config</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> train_and_evaluate_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">trail_config</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> input_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> test_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        config</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">config</span><span class="token punctuation" style="color:#393A34">,</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        num_samples</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">num_samples</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        search_alg</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">search_algo</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        resources_per_trial</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">{</span><span class="token string" style="color:#e3116c">"cpu"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">4</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Get the best trial based on both metrics</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    best_trial </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> analysis</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token plain">get_best_trial</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        metric</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"f1_score"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic"># Primary metric</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mode</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"max"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        scope</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"last"</span><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic"># Consider only the last reported results</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"Best trial: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation 
interpolation">best_trial</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> best_trial</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
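<p>The <code>tune_fasttext_parameters</code> function is the main entry point that configures and runs hyperparameter optimization with Ray Tune. It is set up as follows:</p>
<ol>
<li>
<p><strong>Function parameters</strong>:</p>
<ul>
<li><code>input_path</code>: path to the training data file</li>
<li><code>test_path</code>: path to the test data file</li>
<li><code>num_samples</code>: number of trials in the hyperparameter search; defaults to 50</li>
</ul>
</li>
<li>
<p><strong>Ray initialization</strong>:</p>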
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">ray</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">init</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Initializes the Ray runtime in preparation for distributed training, so that trials can be scheduled in parallel.</p>
</li>
<li>
<p><strong>Hyperparameter search space</strong>:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">config </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">"input"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> input_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">"epoch"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randint</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">50</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">300</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">"lr"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token 
plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">loguniform</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">1e-5</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1e-3</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">"wordNgrams"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">randint</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">4</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" 
aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
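<p>This defines the search space for each fastText hyperparameter:</p>
<ul>
<li><code>epoch</code>: number of training epochs, a random integer from 50 up to but not including 300 (<code>tune.randint</code> excludes the upper bound).</li>
<li><code>lr</code>: learning rate, sampled log-uniformly between 1e-5 and 1e-3.</li>
<li><code>wordNgrams</code>: maximum word n-gram length; <code>tune.randint(1, 4)</code> excludes the upper bound, so the sampled values are 1, 2, or 3.</li>
<li><code>dim</code>: word vector dimension, chosen between 10 and 50.</li>
<li><code>ws</code>: context window size, chosen between 3 and 7.</li>
<li><code>minCount</code>: minimum word frequency, chosen between 2 and 10.</li>
<li><code>bucket</code>: number of hash buckets, chosen from [50000, 100000, 200000]. A smaller value reduces model size but makes hash collisions more likely, which loses feature information.</li>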
</ul>
</li>
<li>
<p><strong>Search algorithm configuration</strong>:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">search_algo </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> OptunaSearch</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    metric</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">"f1_score"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"model_size_mb"</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mode</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">"max"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"min"</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" 
style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Optuna is used as the search algorithm, with two optimization objectives:</p>
<ul>
<li>Maximize the F1 score</li>
<li>Minimize the model size</li>
</ul>
</li>
<li>
<p><strong>Running the hyperparameter search</strong>:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">analysis </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">lambda</span><span class="token plain"> trail_config</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> train_and_evaluate_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">trail_config</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> input_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> test_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    config</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">config</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    num_samples</span><span class="token operator" style="color:#393A34">=</span><span class="token 
plain">num_samples</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    search_alg</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">search_algo</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    resources_per_trial</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">{</span><span class="token string" style="color:#e3116c">"cpu"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">4</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
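<p>This starts the hyperparameter search:</p>
<ul>
<li>Each trial calls the <code>train_and_evaluate_model</code> function</li>
<li>Configurations are drawn from the search space defined above</li>
<li>The search runs for the number of trials given by <code>num_samples</code></li>
<li>Each trial is allocated 4 CPU cores</li>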
</ul>
</li>
<li>
<p><strong>Retrieving the best result</strong>:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">best_trial </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> analysis</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_best_trial</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    metric</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"f1_score"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mode</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"max"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    scope</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"last"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" 
class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
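<p>This selects the best result across all trials:</p>
<ul>
<li>Selection is based on the F1 score; <code>model_size_mb</code> is not considered at this step</li>
<li>The trial with the highest F1 score is chosen</li>
<li>Only the last reported result of each trial is considered</li>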
</ul>
</li>
</ol>
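<p>Using Ray Tune's hyperparameter optimization, this function automatically searches for the best fastText parameter configuration. It takes into account not only model performance (F1 score)
but also model size, which helps find a good balance between performance and resource consumption.</p>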
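<p>To make the sampling rules in the search space concrete, here is a minimal standard-library sketch. It is not Ray Tune's implementation: the helper names <code>sample_randint</code>, <code>sample_loguniform</code>, <code>sample_choice</code>, and <code>sample_config</code> are hypothetical and only mimic the documented behavior of <code>tune.randint</code> (upper bound excluded), <code>tune.loguniform</code>, and <code>tune.choice</code>.</p>

```python
import math
import random


def sample_randint(rng, lower, upper):
    """Uniform integer in [lower, upper): the upper bound is excluded."""
    return rng.randrange(lower, upper)


def sample_loguniform(rng, lower, upper):
    """Sample so that log(value) is uniform between log(lower) and log(upper)."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


def sample_choice(rng, options):
    """Pick one of the listed options with equal probability."""
    return rng.choice(options)


def sample_config(rng):
    """Draw one illustrative trial configuration (hypothetical helper)."""
    return {
        "epoch": sample_randint(rng, 50, 300),
        "lr": sample_loguniform(rng, 1e-5, 1e-3),
        "wordNgrams": sample_randint(rng, 1, 4),
        "bucket": sample_choice(rng, [50000, 100000, 200000]),
    }


# A fixed seed keeps the sketch reproducible.
rng = random.Random(0)
samples = [sample_config(rng) for _ in range(1000)]

# With an excluded upper bound, wordNgrams only ever takes the values 1, 2, 3.
print(sorted({s["wordNgrams"] for s in samples}))
```

<p>Log-uniform sampling spends as many draws between 1e-5 and 1e-4 as between 1e-4 and 1e-3, which is usually what is wanted for a learning rate that spans several orders of magnitude.</p>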
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>Multi-objective optimization</div><div class="admonitionContent_y0EO"><p>In this example, we use Optuna as the search algorithm and set two optimization objectives: F1 score and model size.
This kind of multi-objective optimization helps us find a better balance between model performance and resource consumption.</p><p>When picking the final best trial, however, we still select mainly on F1 score. In practice,
model performance usually takes priority, as long as the model size stays within an acceptable range.</p></div></div>
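<p>One way to make "acceptable model size" explicit is to filter the trial results before picking the highest F1 score. The sketch below is an illustration under stated assumptions, not part of the original code: <code>TrialResult</code>, <code>dominates</code>, <code>pareto_front</code>, and <code>best_within_budget</code> are hypothetical names, the numbers are made up, and the 50 MB budget is arbitrary. It works on plain Python objects rather than on Ray Tune's analysis object.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialResult:
    name: str
    f1_score: float
    model_size_mb: float


def dominates(a, b):
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.f1_score >= b.f1_score and b.model_size_mb >= a.model_size_mb
    strictly_better = a.f1_score > b.f1_score or b.model_size_mb > a.model_size_mb
    return no_worse and strictly_better


def pareto_front(results):
    """Keep only the results that no other result dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


def best_within_budget(results, max_size_mb):
    """Highest-F1 result whose size fits the budget, or None if nothing fits."""
    fitting = [r for r in results if max_size_mb >= r.model_size_mb]
    return max(fitting, key=lambda r: r.f1_score, default=None)


# Made-up numbers for illustration only; they are not measurements from this post.
results = [
    TrialResult("a", f1_score=0.91, model_size_mb=120.0),
    TrialResult("b", f1_score=0.90, model_size_mb=40.0),
    TrialResult("c", f1_score=0.85, model_size_mb=60.0),
    TrialResult("d", f1_score=0.80, model_size_mb=15.0),
]

front = pareto_front(results)
choice = best_within_budget(front, max_size_mb=50.0)
print([r.name for r in front], choice.name if choice else None)
```

<p>Here trial "c" drops out because "b" is both more accurate and smaller. Selecting purely on F1 would return "a", while the size budget picks "b" at a small cost in F1.</p>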
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="可选步骤">Optional Steps<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E5%8F%AF%E9%80%89%E6%AD%A5%E9%AA%A4" class="hash-link" aria-label="Direct link to Optional Steps" title="Direct link to Optional Steps">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="对训练异常进行处理">Handling Training Exceptions<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E5%AF%B9%E8%AE%AD%E7%BB%83%E5%BC%82%E5%B8%B8%E8%BF%9B%E8%A1%8C%E5%A4%84%E7%90%86" class="hash-link" aria-label="Direct link to Handling Training Exceptions" title="Direct link to Handling Training Exceptions">​</a></h3>
<p>fastText can raise errors during training, so we can catch exceptions raised in the training step and handle them explicitly.</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">train_and_evaluate_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">config</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> input_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> test_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">try</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" style="color:#999988;font-style:italic"># Use the dictionary's attributes in the function call</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> fasttext</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token plain">train_supervised</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">**</span><span class="token plain">config</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">save_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">model_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            model_size_mb </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> get_file_size_mb</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">model_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            num_of_samples</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> precision</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> recall </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> model</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token plain">test</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">test_path</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            f1_score </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> precision </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> recall </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">precision </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> recall</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metrics</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"f1_score"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> f1_score</span><span class="token punctuation" style="color:#393A34">,</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"precision"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> precision</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"recall"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> recall</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"model_size"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_output_matrix</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">size</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"model_size_mb"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model_size_mb</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" 
style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">report</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"f1_score"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> f1_score</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"precision"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> precision</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"recall"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> recall</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"model_size"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> 
model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">get_output_matrix</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">size</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"model_size_mb"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model_size_mb</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"model_path"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">except</span><span class="token plain"> Exception </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> e</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" 
style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"Error: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">e</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_param</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'error_message'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token builtin">str</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">e</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metric</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'failed'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            tune</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">report</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"f1_score"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"precision"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"recall"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" 
style="color:#e3116c">"model_size"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">-</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"model_size_mb"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> sys</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">float_info</span><span class="token punctuation" style="color:#393A34">.</span><span class="token builtin">max</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token string" style="color:#e3116c">"model_path"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># ...</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path 
fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
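<p>In the code above, we added a <code>try...except</code> statement to handle exceptions. When a trial fails,
we log the cause of the failure and use <code>tune.report</code> to report sentinel metrics, which tells the optimizer that this parameter combination can raise an exception
and should be avoided where possible.</p>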
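<p>The failure-handling pattern above can be sketched without Ray Tune or MLflow. The minimal sketch below assumes a hypothetical helper, <code>metrics_or_sentinel</code> (not part of the original code), that computes F1 from precision and recall and falls back to the same sentinel values as the <code>except</code> branch whenever the computation raises, for example when precision and recall are both zero.</p>

```python
import sys


def metrics_or_sentinel(precision, recall, model_size_mb):
    """Return trial metrics, or sentinel values if computing them fails.

    Hypothetical helper for illustration; the sentinel values mirror the
    ones reported in the except branch above.
    """
    try:
        # Raises ZeroDivisionError when precision + recall == 0.
        f1_score = 2 * precision * recall / (precision + recall)
        return {
            "f1_score": f1_score,
            "precision": precision,
            "recall": recall,
            "model_size_mb": model_size_mb,
        }
    except Exception as e:
        print(f"Error: {e}")
        # -1 is worse than any real F1 score, and the maximum float is worse
        # than any real model size, so an optimizer that maximizes f1_score
        # or minimizes model_size_mb will rank this trial last.
        return {
            "f1_score": -1,
            "precision": -1,
            "recall": -1,
            "model_size_mb": sys.float_info.max,
        }


ok = metrics_or_sentinel(0.5, 0.5, 3.2)
failed = metrics_or_sentinel(0.0, 0.0, 3.2)
```

<p>Reporting a complete set of metrics even on failure keeps every trial comparable, so a single failing configuration does not have to abort the whole search.</p>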
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="记录模型">Logging the Model<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E8%AE%B0%E5%BD%95%E6%A8%A1%E5%9E%8B" class="hash-link" aria-label="Direct link to Logging the Model" title="Direct link to Logging the Model">​</a></h3>
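<p>After training and validation, we want to save the model file. By building a wrapper that works with <code>mlflow.pyfunc.log_model</code>,
we can quickly save and register the model file. Our wrapper looks like this:</p>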
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">class</span><span class="token plain"> </span><span class="token class-name">FastTextWrapper</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pyfunc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">PythonModel</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">load_context</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> context</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> fasttext</span><span class="token punctuation" style="color:#393A34">.</span><span class="token 
plain">load_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">context</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">artifacts</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">"fasttext_model"</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">predict</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> context</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> model_input</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        predictions </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> text </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> 
model_input</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">"text"</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            label</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> prob </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">predict</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">text</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            predictions</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token string" style="color:#e3116c">"label"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> label</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"probability"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> prob</span><span class="token punctuation" 
style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> predictions</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Once the model has been trained, we can register the model file right away, as follows:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic"># Log the model as an artifact</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_artifact</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">model_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"model"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pyfunc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    artifact_path</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"fasttext_model"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">    python_model</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">FastTextWrapper</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    artifacts</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"fasttext_model"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> model_path</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    registered_model_name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"fasttext_classifier"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 
2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
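<p>This way, MLflow shows us the model file information associated with any given run.</p>
<p>It also makes it easier to reuse the corresponding model later, for example for validation.</p>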
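<p>To see what the wrapper's <code>predict</code> returns without a trained model, here is a minimal sketch of the same loop. <code>StubModel</code> and <code>predict_texts</code> are hypothetical names introduced for illustration; the stub only imitates the shape of a fastText prediction (a tuple of labels plus a sequence of probabilities), and the label and probability it returns are made up.</p>

```python
class StubModel:
    """Hypothetical stand-in for a loaded fastText model."""

    def predict(self, text):
        # Imitates the (labels, probabilities) pair returned for one text.
        # The values are invented for this example.
        return ("__label__positive",), [0.9]


def predict_texts(model, texts):
    """Mirror the loop in FastTextWrapper.predict for a list of texts."""
    predictions = []
    for text in texts:
        label, prob = model.predict(text)
        # Keep only the top label and its probability.
        predictions.append({"label": label[0], "probability": prob[0]})
    return predictions


results = predict_texts(StubModel(), ["great product", "works well"])
print(results)
```

<p>The real wrapper reads its texts from <code>model_input["text"]</code>, so it expects input that has a <code>text</code> column.</p>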
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="结果解析">Analyzing the Results<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E7%BB%93%E6%9E%9C%E8%A7%A3%E6%9E%90" class="hash-link" aria-label="Direct link to Analyzing the Results" title="Direct link to Analyzing the Results">​</a></h2>
<p>We open the MLflow UI and filter for records that satisfy <code>metrics.model_size_mb &lt; 5</code>, which gives us 71 runs, as shown below:</p>
<p><img decoding="async" loading="lazy" alt="MLflow Result" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-result-2d49c485bc45296c94696af5f66fc7f3.png" width="1918" height="1078" class="img_e6Vo"></p>
<p>Sorting by <code>model_size_mb</code> lets us find the smallest model file; we can then check its <code>f1_score</code> and read off the corresponding training parameters.</p>
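<p>The same selection can be expressed in code. The sketch below applies it to a small in-memory list of run records; the records, their values, and the helper name <code>smallest_runs</code> are invented for illustration and are not the 71 runs from the experiment. Against a real tracking server, <code>mlflow.search_runs</code> accepts a comparable <code>filter_string</code> and <code>order_by</code>.</p>

```python
def smallest_runs(runs, max_size_mb):
    """Keep runs below the size limit, smallest model first."""
    selected = [run for run in runs if max_size_mb > run["model_size_mb"]]
    return sorted(selected, key=lambda run: run["model_size_mb"])


# Made-up records standing in for MLflow runs.
runs = [
    {"run": "a", "model_size_mb": 4.1, "f1_score": 0.91},
    {"run": "b", "model_size_mb": 7.8, "f1_score": 0.94},
    {"run": "c", "model_size_mb": 1.3, "f1_score": 0.88},
]

for run in smallest_runs(runs, max_size_mb=5):
    print(run["run"], run["model_size_mb"], run["f1_score"])
```

<p>Printing the score next to the size makes the trade-off visible: the smallest model is not necessarily the one with the best <code>f1_score</code>.</p>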
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="总结">Summary<a href="https://eason-projects.github.io/eason-blog/blog/fasttext#%E6%80%BB%E7%BB%93" class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary">​</a></h2>
<p>In this post, we trained fastText using Ray Tune and MLflow.</p>
<p>With the help of a hyperparameter search strategy, we can quickly find a satisfactory region of the parameter space.</p>]]></content:encoded>
            <category>Machine Learning</category>
            <category>MLflow</category>
            <category>NLP</category>
        </item>
        <item>
            <title><![CDATA[Saving and Using Models with MLflow]]></title>
            <link>https://eason-projects.github.io/eason-blog/blog/mlflow-log-model</link>
            <guid>https://eason-projects.github.io/eason-blog/blog/mlflow-log-model</guid>
            <pubDate>Sat, 08 Feb 2025 00:00:00 GMT</pubDate>
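            <description><![CDATA[When training finishes, we are usually left with the question of where to store the resulting model.]]></description>
            <content:encoded><![CDATA[<p>When training finishes, we are usually left with the question of where to store the resulting model.</p>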
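<p>The simplest option is to store the model in a local folder.
However, that makes it hard to share the model with others.</p>
<p>MLflow provides model logging, which lets us upload a trained model to MLflow as soon as training is done,
so that it can be used in later work.</p>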
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="问题">The Problem<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E9%97%AE%E9%A2%98" class="hash-link" aria-label="Direct link to The Problem" title="Direct link to The Problem">​</a></h2>
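<p>MLflow provides several ways to save models.
In other words, it offers framework-specific methods for conveniently uploading the model files produced by different frameworks.</p>
<p>For example, for <a href="https://scikit-learn.org/" target="_blank" rel="noopener noreferrer">scikit-learn</a>,
MLflow provides <code>mlflow.sklearn.log_model</code> for quickly logging a model, as in this code:</p>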
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic"># Log the sklearn model and register</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">sklearn</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    sk_model</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    artifact_path</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"sklearn-model"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    signature</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">signature</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    registered_model_name</span><span class="token operator" 
style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"sk-learn-random-forest-reg-model"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
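<p>However, for a reinforcement learning framework such as Stable-Baselines3, which we use, MLflow does not officially provide a dedicated integration.</p>
<p>MLflow does, however, offer a more general approach: uploading the model through <code>mlflow.pyfunc.log_model</code>.
Below, we look at how to use MLflow to log the model files produced by training with Stable-Baselines3.</p>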
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="保存模型">Saving the Model<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E4%BF%9D%E5%AD%98%E6%A8%A1%E5%9E%8B" class="hash-link" aria-label="Direct link to Saving the Model" title="Direct link to Saving the Model">​</a></h2>
<p>In this post, we write a class that inherits from <code>mlflow.pyfunc.PythonModel</code> in order to use this capability.</p>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="新建包装类">Creating the Wrapper Class<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E6%96%B0%E5%BB%BA%E5%8C%85%E8%A3%85%E7%B1%BB" class="hash-link" aria-label="Direct link to Creating the Wrapper Class" title="Direct link to Creating the Wrapper Class">​</a></h3>
<p>First, we write a new wrapper class for our algorithm, <code>MLflowDQNWrapper</code>, defined as follows:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> numpy </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> np</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> stable_baselines3 </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> DQN</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">class</span><span class="token plain"> </span><span class="token class-name">MLflowDQNWrapper</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pyfunc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">PythonModel</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">__init__</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">None</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">load_context</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> context</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Load the model from the saved path</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> DQN</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">load</span><span class="token 
punctuation" style="color:#393A34">(</span><span class="token plain">context</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">artifacts</span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">"model_path"</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">predict</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> context</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> model_input</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic"># Convert model_input to numpy array if it's a pandas DataFrame</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token builtin">hasattr</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">model_input</span><span class="token punctuation" style="color:#393A34">,</span><span 
class="token plain"> </span><span class="token string" style="color:#e3116c">'to_numpy'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            model_input </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> model_input</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">to_numpy</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        action</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> _states </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">predict</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">model_input</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> deterministic</span><span class="token operator" style="color:#393A34">=</span><span class="token boolean" style="color:#36acaa">True</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> action</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" 
class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
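<p>In the definition above:</p>
<ul>
<li><code>load_context()</code>: MLflow calls this method back when the model is loaded. Here it rebuilds the DQN network from the stored model file.</li>
<li><code>predict()</code>: MLflow calls this method at inference time. As the code shows, it delegates to the underlying model's own <code>predict</code> method.</li>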
</ul>
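<p>The call order described above can be sketched without MLflow or Stable-Baselines3 installed. This is only a dependency-free mimic: <code>FakeContext</code>, <code>LifecycleDemoWrapper</code>, and the constant-action policy are hypothetical stand-ins, not part of either library.</p>

```python
class FakeContext:
    """Hypothetical stand-in for the context object MLflow passes in."""

    def __init__(self, artifacts):
        self.artifacts = artifacts


class LifecycleDemoWrapper:
    """Mirrors the shape of MLflowDQNWrapper without importing MLflow or SB3."""

    def __init__(self):
        # Nothing heavy happens here; the model is attached later.
        self.model = None
        self.calls = []

    def load_context(self, context):
        # Called once, when the logged model is loaded.
        self.calls.append("load_context")
        self.loaded_from = context.artifacts["model_path"]
        # Stand-in for DQN.load(path): a policy that always picks action 0.
        self.model = lambda batch: [0 for _ in batch]

    def predict(self, context, model_input):
        # Called once per prediction request.
        self.calls.append("predict")
        if hasattr(model_input, "to_numpy"):
            model_input = model_input.to_numpy()
        return self.model(model_input)


wrapper = LifecycleDemoWrapper()
wrapper.load_context(FakeContext({"model_path": "best_model.zip"}))
actions = [wrapper.predict(None, [[0.0, 0.0, 0.0, 0.0]]) for _ in range(3)]
print(wrapper.calls)
```

<p>The point of the split is that the expensive load happens once, while <code>predict()</code> can be called many times against the already loaded model.</p>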
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="上传模型到mlflow">Uploading the Model to MLflow<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E4%B8%8A%E4%BC%A0%E6%A8%A1%E5%9E%8B%E5%88%B0mlflow" class="hash-link" aria-label="Direct link to Uploading the Model to MLflow" title="Direct link to Uploading the Model to MLflow">​</a></h3>
<p>With the wrapper class defined, we upload and save the model at the end of each training run.
The relevant code is as follows:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic"># Get the best model path</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">best_model_path </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> os</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">path</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">join</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">best_model_save_path</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"best_model.zip"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Create and log the wrapped model</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">wrapped_model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> MLflowDQNWrapper</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">input_example </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">array</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0.0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic"># Log the model with MLflow</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pyfunc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    artifact_path</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"model"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    python_model</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">wrapped_model</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    artifacts</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">{</span><span class="token string" style="color:#e3116c">"model_path"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> best_model_path</span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    signature</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">models</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">signature</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">infer_signature</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        model_input</span><span class="token operator" style="color:#393A34">=</span><span 
class="token plain">input_example</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        model_output</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">array</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    input_example</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">input_example</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    registered_model_name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">"cartpole-v1-dqn"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" 
class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
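<p>Let us walk through what the code above does, step by step:</p>
<ul>
<li><code>best_model_path</code>: First, get the path where the best model was saved. It points to the best model file written during training.</li>
<li><code>wrapped_model = MLflowDQNWrapper()</code>: Create an instance of the wrapper class defined earlier. The wrapper tells MLflow how to load and use our DQN model.</li>
<li><code>input_example</code>: Create a sample input. Here it is a numpy array of shape (1, 4), matching the 4 observation values of the CartPole environment. The sample helps MLflow understand the model's input format.</li>
<li><code>mlflow.pyfunc.log_model()</code>: Save the model through MLflow's generic Python function interface, where:
<ul>
<li><code>artifact_path</code>: the path under which the model is stored in MLflow</li>
<li><code>python_model</code>: our wrapper class instance</li>
<li><code>artifacts</code>: the files that belong to the model; here, the path to the model file</li>
<li><code>signature</code>: the model's input and output signature, inferred with <code>infer_signature</code>, which helps MLflow validate the data format</li>
<li><code>input_example</code>: a sample input that helps other users understand how to call the model</li>
<li><code>registered_model_name</code>: the name under which the model is registered in the Model Registry</li>
</ul>
</li>
</ul>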
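<p>To make the role of the (1, 4) input example more concrete, here is a rough sketch of the kind of shape check a signature implies. This is not MLflow's validation code; <code>check_batch</code> is a hypothetical helper written only for illustration, and it assumes observations arrive as nested Python lists.</p>

```python
def check_batch(batch, n_features=4):
    """Return the batch size if every row holds n_features numeric values.

    Illustrative only: mimics the idea of a schema check, assuming a
    CartPole-style observation with 4 values per row.
    """
    if not batch:
        raise ValueError("expected at least one observation")
    for row in batch:
        if not isinstance(row, (list, tuple)):
            # A flat observation is not a batch; wrap it in a list first.
            raise ValueError("each row must be a sequence of values")
        if len(row) != n_features:
            raise ValueError(f"expected {n_features} values per row, got {len(row)}")
        if not all(isinstance(v, (int, float)) for v in row):
            raise ValueError("observations must be numeric")
    return len(batch)


# One observation wrapped as a batch of size 1, like the (1, 4) input example.
batch_size = check_batch([[0.0, 0.0, 0.0, 0.0]])
print(batch_size)
```

<p>A single observation has to be wrapped into a batch before it matches this shape, which is why the input example is a nested array rather than a flat one.</p>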
<p>In this way, a DQN model trained with Stable-Baselines3 can be saved to MLflow, where it is easy to version and deploy.</p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="在mlflow中查看模型">Viewing the Model in MLflow<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E5%9C%A8mlflow%E4%B8%AD%E6%9F%A5%E7%9C%8B%E6%A8%A1%E5%9E%8B" class="hash-link" aria-label="Direct link to Viewing the Model in MLflow" title="Direct link to Viewing the Model in MLflow">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="模型查看">Browsing Models<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E6%A8%A1%E5%9E%8B%E6%9F%A5%E7%9C%8B" class="hash-link" aria-label="Direct link to Browsing Models" title="Direct link to Browsing Models">​</a></h3>
<p>Once the upload succeeds, the model we created appears in the MLflow UI together with its related records.</p>
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>One model or several?</div><div class="admonitionContent_y0EO"><p>For workloads such as multi-round training, we recommend saving every version under the same model name.</p><p>When we upload a new version under an existing model name (for example, <code>cartpole-v1-dqn</code>), the new version is numbered automatically.
So we only need to keep uploading new versions to the same model.</p></div></div>
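<p>The automatic numbering can be pictured with a toy in-memory registry. This is a hypothetical model of the behavior, not MLflow's implementation; <code>ToyRegistry</code> and the run paths are made up for illustration.</p>

```python
from collections import defaultdict


class ToyRegistry:
    """Toy model of per-name version numbering (not MLflow code)."""

    def __init__(self):
        # Maps a model name to the list of sources registered under it.
        self._versions = defaultdict(list)

    def register(self, name, source):
        """Store a new version under `name` and return its version number."""
        versions = self._versions[name]
        versions.append(source)
        # Version numbers start at 1 and grow by one per upload.
        return len(versions)


registry = ToyRegistry()
first = registry.register("cartpole-v1-dqn", "run-a/best_model.zip")
second = registry.register("cartpole-v1-dqn", "run-b/best_model.zip")
other = registry.register("some-other-model", "run-c/best_model.zip")
print(first, second, other)
```

<p>Each model name keeps its own counter, so reusing one name across training runs yields one ordered history of versions.</p>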
<p>In the web UI we can easily browse the information for every model version, as shown below:</p>
<p><img decoding="async" loading="lazy" alt="MLflow Registered Models" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-registered-models-eea68e22e6c08b5da6410d3d7fffa853.png" width="1146" height="819" class="img_e6Vo"></p>
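<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="设置别名">Setting an Alias<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E8%AE%BE%E7%BD%AE%E5%88%AB%E5%90%8D" class="hash-link" aria-label="Direct link to Setting an Alias" title="Direct link to Setting an Alias">​</a></h3>
<p>When we spot a training run with good metrics in the Experiments module, we can open the details page of the registered model linked to that run.</p>
<p>For example, sorting by our chosen metric in descending order shows that the first model performs best, as in the figure below:</p>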
<p><img decoding="async" loading="lazy" alt="MLflow Experiments" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-experiments-478a1192832af523a168dfa2bda34cba.png" width="1390" height="578" class="img_e6Vo">
<em>Training run list page</em></p>
<p>After opening that run's page, we can see the link to the registered model, as shown below:</p>
<p><img decoding="async" loading="lazy" alt="MLflow Run" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-run-08170a09b9be7430196a0b324c17cb2b.png" width="1394" height="1018" class="img_e6Vo">
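<em>Training run details page</em></p>
<p>Clicking the model link takes us to the corresponding model details page.
There, we click <code>Aliases</code> to set the alias we want.
In the figure below, for example, we give the model the alias <code>champion</code>, meaning the best model so far.</p>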
<p><img decoding="async" loading="lazy" alt="MLflow Model Details" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-model-details-8c89e717fb63b1f5a87660f9fef1622d.png" width="1392" height="811" class="img_e6Vo">
<em>Model details page</em></p>
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>Choosing aliases</div><div class="admonitionContent_y0EO"><p>In general, we can give the best model the alias <code>champion</code>,
and give promising models that still await validation an alias such as <code>challenger</code>.</p><p>The exact names, however, can be chosen freely to fit the needs of the project.</p></div></div>
<p>Once the alias is set, the model download step later on can fetch the model version that the alias points to.</p>
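<p>Aliases can also be set from code rather than through the UI. The sketch below assumes a client object exposing <code>set_registered_model_alias</code> and <code>get_model_version_by_alias</code>, as <code>mlflow.tracking.MlflowClient</code> does in recent MLflow releases. The <code>promote</code> helper and the <code>RecordingClient</code> stub are hypothetical, added so the example runs without a tracking server.</p>

```python
def promote(client, model_name, version, alias="champion"):
    """Point `alias` at `version` of `model_name`, then resolve the alias."""
    client.set_registered_model_alias(model_name, alias, str(version))
    return client.get_model_version_by_alias(model_name, alias)


class RecordingClient:
    """Hypothetical in-memory stand-in for an MLflow client (dry run only).

    The real client returns a ModelVersion object from
    get_model_version_by_alias; this stub returns the version string.
    """

    def __init__(self):
        self.aliases = {}

    def set_registered_model_alias(self, name, alias, version):
        # An alias points at exactly one version; setting it again moves it.
        self.aliases[(name, alias)] = version

    def get_model_version_by_alias(self, name, alias):
        return self.aliases[(name, alias)]


client = RecordingClient()
resolved_first = promote(client, "cartpole-v1-dqn", 3)
resolved_second = promote(client, "cartpole-v1-dqn", 5)
print(resolved_first, resolved_second)
```

<p>Against a real server, one would pass an actual MLflow client instead of the stub. Because the alias simply moves to the newer version, code that loads by alias needs no change when a new champion is chosen.</p>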
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="下载使用模型">Downloading and Using the Model<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E4%B8%8B%E8%BD%BD%E4%BD%BF%E7%94%A8%E6%A8%A1%E5%9E%8B" class="hash-link" aria-label="Direct link to Downloading and Using the Model" title="Direct link to Downloading and Using the Model">​</a></h2>
<p>For our Stable-Baselines3 setup, we write a new function that loads the saved model and uses it.</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> os</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> numpy </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> np</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> mlflow</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> gymnasium </span><span class="token keyword" style="color:#00009f">as</span><span class="token plain"> gym</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">from</span><span class="token plain"> tqdm </span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> tqdm</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">def</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">evaluate_mlflow</span><span class="token punctuation" style="color:#393A34">(</span><span 
class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Set up MLflow tracking URI</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_tracking_uri</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">MLFLOW_TRACKING_URI</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Create environment</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    env </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> CartPoleWrapper</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">gym</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">make</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"CartPole-v1"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> render_mode</span><span class="token operator" style="color:#393A34">=</span><span class="token boolean" style="color:#36acaa">None</span><span class="token punctuation" style="color:#393A34">)</span><span class="token 
punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Load the model from the MLflow Model Registry via its "champion" alias</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    model_name </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"cartpole-v1-dqn"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    alias </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"champion"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    loaded_model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pyfunc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">load_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"models:/</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">model_name</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">@</span><span class="token string-interpolation 
interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">alias</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Evaluate the model</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    n_eval_episodes </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">100</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    episode_rewards </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> episode </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> tqdm</span><span class="token punctuation" style="color:#393A34">(</span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">n_eval_episodes</span><span class="token 
punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        obs</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> _ </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> env</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">reset</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        done </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">False</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        episode_reward </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">while</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> done</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token comment" 
style="color:#999988;font-style:italic"># Get prediction from MLflow model</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            action </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> loaded_model</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">predict</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">array</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">obs</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> dtype</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">float64</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            obs</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> reward</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> terminated</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> truncated</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> _ </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> env</span><span class="token punctuation" style="color:#393A34">.</span><span class="token 
plain">step</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">action</span><span class="token punctuation" style="color:#393A34">[</span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            episode_reward </span><span class="token operator" style="color:#393A34">+=</span><span class="token plain"> reward</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            done </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> terminated </span><span class="token keyword" style="color:#00009f">or</span><span class="token plain"> truncated</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        episode_rewards</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">append</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">episode_reward</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mean_reward </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">mean</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">episode_rewards</span><span class="token punctuation" 
style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    std_reward </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> np</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">std</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">episode_rewards</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"Mean reward: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">mean_reward</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">:</span><span class="token string-interpolation interpolation format-spec">.2f</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c"> +/- </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">std_reward</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">:</span><span class="token string-interpolation interpolation format-spec">.2f</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" 
style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"Episode rewards: </span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">episode_rewards</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> mean_reward</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> std_reward</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Let's walk through how this code works:</p>
<ol>
<li>
<p>First, we set the MLflow tracking URI so that the program knows where to download the model from:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_tracking_uri</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">MLFLOW_TRACKING_URI</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
</li>
<li>
<p>Next, we create a CartPole environment for evaluation. We set <code>render_mode=None</code> because we only need a numerical evaluation:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">env </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> CartPoleWrapper</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">gym</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">make</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"CartPole-v1"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> render_mode</span><span class="token operator" style="color:#393A34">=</span><span class="token boolean" style="color:#36acaa">None</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
</li>
<li>
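<p>Then we load the model from MLflow. Here we use the model name together with the alias created above to specify exactly which model version to load:</p>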
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">model_name </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"cartpole-v1-dqn"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">alias </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"champion"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">loaded_model </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">pyfunc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">load_model</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-interpolation string" style="color:#e3116c">f"models:/</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">model_name</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">@</span><span class="token string-interpolation interpolation punctuation" style="color:#393A34">{</span><span class="token string-interpolation interpolation">alias</span><span class="token 
string-interpolation interpolation punctuation" style="color:#393A34">}</span><span class="token string-interpolation string" style="color:#e3116c">"</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>Other ways to load a model</div><div class="admonitionContent_y0EO"><p>Besides loading a model by its alias (<code>alias</code>), we can also load it by its version. For example:</p><div class="codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-text codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version}")</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div></div></div>
</li>
<li>
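<p>Finally, we evaluate the model:</p>
<ul>
<li>Set the number of evaluation episodes to 100</li>
<li>For each episode:<!-- -->
<ul>
<li>Reset the environment to get the initial observation</li>
<li>Loop until the episode ends (<code>done</code> is True)</li>
<li>Use the model to predict an action and execute it</li>
<li>Accumulate the reward</li>
</ul>
</li>
<li>At the end, compute the mean reward and its standard deviation</li>
</ul>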
</li>
</ol>
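<p>The two model URI forms mentioned above (by alias and by version) can be sketched as a small helper. This is only an illustration: <code>build_model_uri</code> is a hypothetical helper name, not part of MLflow, and the version number <code>1</code> is an assumed example value. The returned string is what would be passed to <code>mlflow.pyfunc.load_model</code>.</p>

```python
def build_model_uri(model_name, alias=None, version=None):
    """Return a Model Registry URI for exactly one of alias or version.

    Hypothetical helper for illustration; it only formats strings.
    """
    if (alias is None) == (version is None):
        raise ValueError("Provide exactly one of alias or version")
    if alias is not None:
        # Alias form, as used in the evaluation code: models:/NAME@ALIAS
        return f"models:/{model_name}@{alias}"
    # Version form, as shown in the tip: models:/NAME/VERSION
    return f"models:/{model_name}/{version}"


print(build_model_uri("cartpole-v1-dqn", alias="champion"))
print(build_model_uri("cartpole-v1-dqn", version=1))  # version 1 is an assumed example
```

<p>Keeping both forms behind one function makes it harder to mix up the <code>@</code> and <code>/</code> separators by accident.</p>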
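<p>This evaluation function helps us verify that the model downloaded from MLflow works correctly,
and shows how well it performs.</p>
<p>By running many episodes and averaging the reward, we get a reasonably reliable performance estimate.</p>
<p>Over 100 episodes, the model above reached the maximum score on average.
This indicates that the model we trained and saved earlier performs well.</p>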
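<p>As a rough sketch of how the summary statistics behave, the snippet below computes the mean and standard deviation with the Python standard library. The reward values are made up for illustration and are not results from this article; <code>summarize_rewards</code> is a hypothetical helper name. It uses the population standard deviation, which matches the default behaviour of <code>np.std</code> used in the evaluation function.</p>

```python
import statistics


def summarize_rewards(episode_rewards):
    """Return (mean, population standard deviation) of the episode rewards."""
    mean_reward = statistics.fmean(episode_rewards)
    # pstdev divides by N, like np.std with its default ddof=0.
    std_reward = statistics.pstdev(episode_rewards)
    return mean_reward, std_reward


# Made-up rewards for illustration only.
rewards = [500.0, 500.0, 480.0, 500.0]
mean_reward, std_reward = summarize_rewards(rewards)
print(f"Mean reward: {mean_reward:.2f} +/- {std_reward:.2f}")
```

<p>A mean equal to the maximum score together with a standard deviation of zero would mean that every single episode reached the maximum.</p>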
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="总结">Summary<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E6%80%BB%E7%BB%93" class="hash-link" aria-label="Direct link to 总结" title="Direct link to 总结">​</a></h2>
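<p>In this article, we described in detail how to use MLflow to manage and save reinforcement learning models,
in particular models trained with the Stable-Baselines3 framework.
The main points are:</p>
<ol>
<li>Create a custom wrapper class by subclassing <code>mlflow.pyfunc.PythonModel</code> so that MLflow can understand and load our DQN model.</li>
<li>Save the model to MLflow with <code>mlflow.pyfunc.log_model</code>.</li>
<li>View and manage model versions in the MLflow UI, and assign model aliases.</li>
<li>Load the model from MLflow and evaluate it.</li>
</ol>
<p>This approach provides a unified way to manage models and also makes model versioning and deployment convenient.
With MLflow's model registry, we can better track and manage different model versions,
which gives a reliable basis for iterating on the model.</p>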
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="参考资料">References<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-log-model#%E5%8F%82%E8%80%83%E8%B5%84%E6%96%99" class="hash-link" aria-label="Direct link to 参考资料" title="Direct link to 参考资料">​</a></h2>
<ul>
<li><a href="https://mlflow.org/docs/latest/model-registry.html#registering-an-unsupported-machine-learning-model" target="_blank" rel="noopener noreferrer">MLflow Model Registry - Registering an Unsupported Machine Learning Model</a></li>
</ul>]]></content:encoded>
            <category>Machine Learning</category>
            <category>MLflow</category>
        </item>
        <item>
            <title><![CDATA[Using PostgreSQL with MLflow]]></title>
            <link>https://eason-projects.github.io/eason-blog/blog/mlflow-change-db</link>
            <guid>https://eason-projects.github.io/eason-blog/blog/mlflow-change-db</guid>
            <pubDate>Wed, 05 Feb 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[By default, MLflow uses SQLite to store experiment data and related information.]]></description>
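            <content:encoded><![CDATA[<p>By default, MLflow uses SQLite to store experiment data and related information.
However, as the number of experiments grows, backend query performance degrades.</p>
<p>In production, we recommend a more robust database such as PostgreSQL (abbreviated as Postgres below).</p>
<p>This article describes how to replace MLflow's default SQLite store with Postgres to speed up backend data handling.</p>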
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="搭建postgres数据库">Setting Up a Postgres Database<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E6%90%AD%E5%BB%BApostgres%E6%95%B0%E6%8D%AE%E5%BA%93" class="hash-link" aria-label="Direct link to 搭建Postgres数据库" title="Direct link to 搭建Postgres数据库">​</a></h2>
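<p>To use Postgres as the backend database, we first need a database instance.
This article briefly shows how to set up a Postgres database locally.</p>
<p>For ease of demonstration, we use Docker to get the database running quickly.</p>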
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="安装docker">Installing Docker<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E5%AE%89%E8%A3%85docker" class="hash-link" aria-label="Direct link to 安装Docker" title="Direct link to 安装Docker">​</a></h3>
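<p>Before setting anything up, please install Docker. See the Docker website
(<a href="https://www.docker.com/" target="_blank" rel="noopener noreferrer">https://www.docker.com</a>) for installation instructions.</p>
<p>The rest of this article assumes that Docker is already installed.</p>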
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="创建环境变量文件env">Creating the <code>.env</code> Environment File<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E5%88%9B%E5%BB%BA%E7%8E%AF%E5%A2%83%E5%8F%98%E9%87%8F%E6%96%87%E4%BB%B6env" class="hash-link" aria-label="Direct link to 创建环境变量文件env" title="Direct link to 创建环境变量文件env">​</a></h3>
<p>When starting the Docker container, we want it to load our preset credentials automatically.
To do so, we create a local <code>.env</code> file with the following content:</p>
<div class="language-plaintext codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-plaintext codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">POSTGRES_USER=username</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">POSTGRES_PASSWORD=password</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">POSTGRES_DB=mlflow</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>This file sets the default user name, password, and database name.</p>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="启动docker镜像">Starting the Docker Container<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E5%90%AF%E5%8A%A8docker%E9%95%9C%E5%83%8F" class="hash-link" aria-label="Direct link to 启动Docker镜像" title="Direct link to 启动Docker镜像">​</a></h3>
<p>With this file in place, we can start the Postgres container with the following command.</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">docker run --name postgres_mlflow \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    --env-file ./.env \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -v ./data:/var/lib/postgresql/data \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -e PGDATA=/var/lib/postgresql/data/db-files/ \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -p 5432:5432 \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -d postgres</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
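<p>This command starts a Docker container named <code>postgres_mlflow</code> from the latest official <code>postgres</code> image.</p>
<p>The <code>--env-file ./.env</code> option loads the environment variables from the <code>.env</code> file created above.</p>
<p>In addition, the local folder <code>./data</code> is mounted at the container's data directory <code>/var/lib/postgresql/data</code>.
This way, the data files persist after the container stops,
so no data is lost when the container is stopped and started again.</p>
<p>Finally, the container's port <code>5432</code> is mapped to port <code>5432</code> on the host,
so we can reach the database through local port <code>5432</code>.</p>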
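<p>A quick way to confirm that the port mapping described above is in effect is to attempt a TCP connection to the host port. The following is a minimal sketch using only the Python standard library; <code>port_is_open</code> is a hypothetical helper name. Note that a successful connection only shows that something is listening on the port, not that Postgres has finished initializing or accepts the credentials.</p>

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and port follow the docker run command above.
    print("Port 5432 reachable:", port_is_open("localhost", 5432))
```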
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="可选检查镜像启动情况">(Optional) Checking the Container Status<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E5%8F%AF%E9%80%89%E6%A3%80%E6%9F%A5%E9%95%9C%E5%83%8F%E5%90%AF%E5%8A%A8%E6%83%85%E5%86%B5" class="hash-link" aria-label="Direct link to （可选）检查镜像启动情况" title="Direct link to （可选）检查镜像启动情况">​</a></h3>
<p>Once the container has been started, we can check its status with <code>docker ps</code>.
The output looks like this:</p>
<div class="language-plaintext codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-plaintext codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">CONTAINER ID   IMAGE      COMMAND                  CREATED         STATUS         PORTS                    NAMES</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">e0a13ac10ec2   postgres   "docker-entrypoint.s…"   5 seconds ago   Up 4 seconds   0.0.0.0:5432-&gt;5432/tcp   postgres_mlflow</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
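<p>The output shows that a container named <code>postgres_mlflow</code> is running from the <code>postgres</code> image.
It also shows that port <code>5432</code> on the host is mapped to TCP port <code>5432</code> in the container.</p>
<p>This tells us that, at the container level, the database started without problems.</p>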
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>What if nothing shows up?</div><div class="admonitionContent_y0EO"><p>If <code>docker ps</code> shows no matching output, the container may have exited abnormally.
We can run <code>docker ps -a</code> to list all containers, including stopped ones, and check whether something went wrong.</p><p>We can also use the following commands to find out why a container failed:</p><ul>
<li><code>docker logs &lt;CONTAINER_ID&gt;</code>: shows the log output of the failed container.</li>
<li><code>docker inspect &lt;CONTAINER_ID&gt;</code>: shows the container's configuration and related details.</li>
</ul></div></div>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="可选连接测试postgres数据库">(Optional) Testing the Connection to the Postgres Database<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E5%8F%AF%E9%80%89%E8%BF%9E%E6%8E%A5%E6%B5%8B%E8%AF%95postgres%E6%95%B0%E6%8D%AE%E5%BA%93" class="hash-link" aria-label="Direct link to （可选）连接测试Postgres数据库" title="Direct link to （可选）连接测试Postgres数据库">​</a></h3>
<p>To verify that the database is fully set up, we can start the
<a href="https://hub.docker.com/_/adminer/" target="_blank" rel="noopener noreferrer">Adminer</a> Docker image and use it to test the database service.
The following command starts it:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">docker run --link postgres_mlflow:db -p 8080:8080 -d adminer</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The <code>--link postgres_mlflow:db</code> option
makes the container named <code>postgres_mlflow</code> reachable under the alias <code>db</code>
from the newly created <code>adminer</code> container.</p>
<p>Once it is running, open
<a href="http://localhost:8080/" target="_blank" rel="noopener noreferrer">http://localhost:8080</a> in a browser to view the database.</p>
<p>On the login page, select <code>PostgreSQL</code> and enter the login details, for example:</p>
<p><img decoding="async" loading="lazy" alt="Adminer Login UI" src="https://eason-projects.github.io/eason-blog/assets/images/adminer-login-ui-58053251c3e61e8222cf14771cfd6ac3.png" width="1194" height="682" class="img_e6Vo"></p>
<p>After clicking the <code>Login</code> button, we can see the system information, for example:</p>
<p><img decoding="async" loading="lazy" alt="Adminer UI" src="https://eason-projects.github.io/eason-blog/assets/images/adminer-ui-7c27eeb931684f4168abbc89de402004.png" width="1498" height="886" class="img_e6Vo"></p>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="小结">Recap<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E5%B0%8F%E7%BB%93" class="hash-link" aria-label="Direct link to 小结" title="Direct link to 小结">​</a></h3>
<p>We have now set up a local Postgres database using a Docker container.</p>
<p>Next, we can change MLflow's backend store so that it uses the new Postgres database service.</p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="mlflow使用新的数据库">Pointing MLflow at the New Database<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#mlflow%E4%BD%BF%E7%94%A8%E6%96%B0%E7%9A%84%E6%95%B0%E6%8D%AE%E5%BA%93" class="hash-link" aria-label="Direct link to MLflow使用新的数据库" title="Direct link to MLflow使用新的数据库">​</a></h2>
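<p>The MLflow server accepts a <code>--backend-store-uri</code> option at startup.
By default, it uses a local folder named <code>./mlruns</code> to store all experiment data.</p>
<p>We pass a database connection string accepted by <code>SQLAlchemy</code> so that the MLflow server connects to our Postgres database.</p>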
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="将env文件内容设置为环境变量">Exporting the <code>.env</code> File Contents as Environment Variables<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E5%B0%86env%E6%96%87%E4%BB%B6%E5%86%85%E5%AE%B9%E8%AE%BE%E7%BD%AE%E4%B8%BA%E7%8E%AF%E5%A2%83%E5%8F%98%E9%87%8F" class="hash-link" aria-label="Direct link to 将env文件内容设置为环境变量" title="Direct link to 将env文件内容设置为环境变量">​</a></h3>
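<p>To improve security, we expose the contents of the <code>.env</code> file as environment variables
so that MLflow can use the database credentials.
The following command exports the entries of the <code>.env</code> file:</p>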
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">export $(cat .env | xargs)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>We can check the result with the following command:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">echo $POSTGRES_USER</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The output should be:</p>
<div class="language-plaintext codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-plaintext codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">username</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
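<p>The shell one-liner above works for simple <code>KEY=VALUE</code> files like ours, but it can misbehave when the file contains comment lines or values with spaces. As an alternative, the same step can be sketched in Python. This is a minimal illustration, not a full <code>.env</code> parser: <code>parse_env_text</code> and <code>load_env_file</code> are hypothetical helper names, and quoted values or <code>export</code> prefixes are deliberately not handled.</p>

```python
import os


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comment lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_env_file(path=".env"):
    """Read a .env file and copy its entries into os.environ."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env_text(handle.read())
    os.environ.update(values)
    return values


# Same content as the .env file created earlier in this article.
sample = "POSTGRES_USER=username\nPOSTGRES_PASSWORD=password\nPOSTGRES_DB=mlflow\n"
print(parse_env_text(sample))
```

<p>Variables set this way are visible only to the current Python process and the child processes it starts, not to the surrounding shell.</p>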
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="使用新的数据库">Using the New Database<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E4%BD%BF%E7%94%A8%E6%96%B0%E7%9A%84%E6%95%B0%E6%8D%AE%E5%BA%93" class="hash-link" aria-label="Direct link to 使用新的数据库" title="Direct link to 使用新的数据库">​</a></h3>
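<p>Once the environment variables hold the user name and password we need,
we can start the MLflow server connected to our database with the following command:</p>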
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">mlflow server \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    --host 0.0.0.0 \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    --port 8081 \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    --backend-store-uri postgresql://$POSTGRES_USER:$POSTGRES_PASSWORD@localhost:5432/mlflow</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The value passed to <code>--backend-store-uri</code> means:</p>
<p>using the account (<code>$POSTGRES_USER</code>) and password (<code>$POSTGRES_PASSWORD</code>),
connect to the <code>mlflow</code> database on port <code>5432</code> of the local machine (<code>localhost</code>).</p>
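<p>The structure of this connection string can be sketched in Python as follows. <code>build_backend_store_uri</code> is a hypothetical helper name used only for illustration, and the fallback values <code>username</code> and <code>password</code> are the placeholders from the <code>.env</code> file above. The sketch percent-encodes the user name and password with <code>urllib.parse.quote_plus</code>, on the assumption that a real password may contain characters such as <code>@</code> or <code>/</code> that would otherwise break the URI.</p>

```python
import os
from urllib.parse import quote_plus


def build_backend_store_uri(user, password, host="localhost", port=5432, database="mlflow"):
    """Assemble a postgresql:// URI, percent-encoding the user and password."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Read the credentials from the environment, falling back to the placeholders.
uri = build_backend_store_uri(
    os.environ.get("POSTGRES_USER", "username"),
    os.environ.get("POSTGRES_PASSWORD", "password"),
)
print(uri)
```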
<p>After MLflow starts successfully, open <a href="http://localhost:8081/" target="_blank" rel="noopener noreferrer">http://localhost:8081</a> in a browser,
and we see a brand-new MLflow instance,
as shown below:</p>
<p><img decoding="async" loading="lazy" alt="MLflow New Instance" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-new-instance-1a891951d6cf6a5b3367f9365ce47af4.png" width="2562" height="1420" class="img_e6Vo"></p>
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>Where did the previous data go?</div><div class="admonitionContent_y0EO"><p>If you have run MLflow before, you will notice that the earlier records have disappeared.
This is because we are now using a new database; the earlier records are stored elsewhere.</p></div></div>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="可选使用docker运行mlflow">(Optional) Run MLflow with Docker<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E5%8F%AF%E9%80%89%E4%BD%BF%E7%94%A8docker%E8%BF%90%E8%A1%8Cmlflow" class="hash-link" aria-label="Direct link to （可选）使用Docker运行MLflow" title="Direct link to （可选）使用Docker运行MLflow">​</a></h3>
<div class="language-bash codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-bash codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">docker run -p 8090:5000 \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    --link postgres_mlflow:db \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    --env-file .env \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    -v ./artifacts:/mlflow/artifacts \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    ghcr.io/mlflow/mlflow \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    bash -c "python3 -m pip install pip --upgrade &amp;&amp; python3 -m pip install psycopg2-binary &amp;&amp; mlflow server"</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
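<p>With the command above, we can run our MLflow service in Docker. Because of the <code>-p 8090:5000</code> port mapping, the service is reachable on port <code>8090</code> of the host.</p>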
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>Configure the .env file properly</div><div class="admonitionContent_y0EO"><p>The command above relies on a properly configured <code>.env</code> file, which may contain the following:</p><div class="language-env codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-env codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">POSTGRES_USER=****</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">POSTGRES_PASSWORD=****</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">POSTGRES_DB=mlflow</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">MLFLOW_BACKEND_STORE_URI=postgresql://****:****@db:5432/mlflow</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">MLFLOW_HOST=0.0.0.0</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">MLFLOW_PORT=5000</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" 
aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div></div></div>
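<p>Because Docker's <code>--env-file</code> reads values literally and does not expand <code>$VARIABLE</code> references, <code>MLFLOW_BACKEND_STORE_URI</code> has to be written out in full. The sketch below shows one way to derive that value from the other entries. The helper names <code>parse_env</code> and <code>build_store_uri</code> and the sample credentials are assumptions made for illustration; the host <code>db</code> is the alias set by <code>--link postgres_mlflow:db</code>.</p>

```python
from urllib.parse import quote


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def build_store_uri(env, host="db", port=5432):
    """Compose a PostgreSQL URI, percent-encoding user name and password."""
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{env['POSTGRES_DB']}"


# Hypothetical sample values; a real .env file holds your own credentials.
sample = """
POSTGRES_USER=mlflow_user
POSTGRES_PASSWORD=p@ss/word
POSTGRES_DB=mlflow
"""

env = parse_env(sample)
store_uri = build_store_uri(env)
print(f"MLFLOW_BACKEND_STORE_URI={store_uri}")
```

<p>Percent-encoding matters when the password contains characters such as <code>@</code> or <code>/</code>, which would otherwise be read as URI delimiters.</p>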
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="可选检查数据库">(Optional) Inspect the Database<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E5%8F%AF%E9%80%89%E6%A3%80%E6%9F%A5%E6%95%B0%E6%8D%AE%E5%BA%93" class="hash-link" aria-label="Direct link to （可选）检查数据库" title="Direct link to （可选）检查数据库">​</a></h3>
<p>After MLflow has started, we can reopen the Adminer tool we set up earlier to check that the database is working properly.
After selecting the <code>mlflow</code> database, we can see the tables that MLflow created automatically, as shown below:</p>
<p><img decoding="async" loading="lazy" alt="MLflow Tables" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-tables-185a66ddde3f153ed92d31921358958a.png" width="2166" height="1424" class="img_e6Vo"></p>
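<p>This also confirms that subsequent MLflow data will be stored in these database tables.</p>
<p>For any later database operations, we can log in to the database backend and perform them there.</p>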
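<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="总结">Summary<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-change-db#%E6%80%BB%E7%BB%93" class="hash-link" aria-label="Direct link to 总结" title="Direct link to 总结">​</a></h2>
<p>In this post, we set up a Postgres database service with Docker
and changed how MLflow is started so that it connects to the new database to store experiment data, which keeps experiment results available over the long term.</p>
<p>Note that the approach in this post is for demonstration purposes only.
A real production environment requires a production-grade database and MLflow instance.</p>]]></content:encoded>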
            <category>Machine Learning</category>
            <category>Database</category>
            <category>MLflow</category>
        </item>
        <item>
            <title><![CDATA[Introduction to MLflow]]></title>
            <link>https://eason-projects.github.io/eason-blog/blog/mlflow-intro</link>
            <guid>https://eason-projects.github.io/eason-blog/blog/mlflow-intro</guid>
            <pubDate>Mon, 03 Feb 2025 00:00:00 GMT</pubDate>
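            <description><![CDATA[While using Ray Tune, we wanted an open-source, full-featured experiment tracking platform.]]></description>
            <content:encoded><![CDATA[<p>While using Ray Tune, we wanted an open-source, full-featured experiment tracking platform
that helps us track the parameters being tuned during training, as well as the final metrics of each experiment.</p>
<p>We therefore explored managing this with a locally hosted MLflow instance.</p>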
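<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="mlflow的主要功能">Key Features of MLflow<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#mlflow%E7%9A%84%E4%B8%BB%E8%A6%81%E5%8A%9F%E8%83%BD" class="hash-link" aria-label="Direct link to MLflow的主要功能" title="Direct link to MLflow的主要功能">​</a></h2>
<p>MLflow offers a range of capabilities, including experiment tracking and model registration, management, and deployment.</p>
<p>In this post we focus on its training-tracking capabilities, used together with Ray Tune to search for suitable reinforcement learning training parameters.</p>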
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="mlflow-tracking">MLflow Tracking<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#mlflow-tracking" class="hash-link" aria-label="Direct link to MLflow Tracking" title="Direct link to MLflow Tracking">​</a></h2>
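<p>As shown in the figure below, the MLflow architecture is built around the following concepts:</p>
<ul>
<li><strong>Experiments</strong>: all training tracking happens within one or more experiments. We can create multiple Experiments for different pieces of work.</li>
<li><strong>Run</strong>: a single training session or execution. Each Run is associated with a set of configuration values and metrics, and produces one or more models.</li>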
</ul>
<p><img decoding="async" loading="lazy" alt="MLflow Tracking Architecture" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-tracking-architecture-44181b15c8af38a7a6abb25f05b9eb9d.png" width="1842" height="699" class="img_e6Vo">
<em>Source: <a href="https://mlflow.org/docs/latest/getting-started/index.html#mlflow-tracking" target="_blank" rel="noopener noreferrer">https://mlflow.org/docs/latest/getting-started/index.html#mlflow-tracking</a></em></p>
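<p>In the figure above, Experiments are on the left and the Runs of the selected Experiment are on the right.
Opening a Run shows its configuration, metrics, models, and related content.</p>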
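<p>The relationship between Experiments and Runs can be pictured with a small data model. The classes below are a toy sketch for illustration only; they are not MLflow's own classes, and the <code>best_run</code> helper and the sample values are assumptions.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One training execution with its configuration and results."""
    name: str
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)


@dataclass
class Experiment:
    """A named container that groups related runs."""
    name: str
    runs: list = field(default_factory=list)

    def best_run(self, metric):
        """Return the run with the highest `metric`, or None if no run has it."""
        scored = [run for run in self.runs if metric in run.metrics]
        return max(scored, key=lambda run: run.metrics[metric], default=None)


# Hypothetical runs, e.g. two trials of a hyperparameter search.
experiment = Experiment("experiment_1")
experiment.runs.append(Run("run_a", {"lr": 0.01}, {"reward": 12.5}))
experiment.runs.append(Run("run_b", {"lr": 0.001}, {"reward": 17.0}))

best = experiment.best_run("reward")
print(best.name, best.params)
```

<p>Comparing Runs within one Experiment in this way is the kind of question the MLflow UI answers through its Runs table.</p>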
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="运行mlflow">Running MLflow<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E8%BF%90%E8%A1%8Cmlflow" class="hash-link" aria-label="Direct link to 运行MLflow" title="Direct link to 运行MLflow">​</a></h2>
<p>In this post, we set up a simple local MLflow instance.
The following command installs the latest version of MLflow:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">pip install mlflow</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>After installation, start the service with the following command:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">mlflow server --host 0.0.0.0 --port 8080</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>MLflow host and port settings</div><div class="admonitionContent_y0EO"><p>The command above binds MLflow to the address 0.0.0.0,
which lets any device that can reach this machine over the network access our MLflow instance.
Be careful with this setting in production.</p><p>Also, if port 8080 is already in use, choose a different port.</p></div></div>
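<p>If you are unsure whether port 8080 is taken, a quick check from Python can help. This is a minimal sketch using only the standard library; the function names and the candidate port range are assumptions chosen for illustration.</p>

```python
import socket


def is_port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def first_free_port(candidates, host="127.0.0.1"):
    """Return the first candidate port nothing is listening on, else None."""
    for port in candidates:
        if not is_port_in_use(port, host):
            return port
    return None


# Look for a usable port starting from the one used in this post.
print(first_free_port(range(8080, 8090)))
```

<p>The port it prints can then be passed to <code>mlflow server --port</code>.</p>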
<p>Once the server is running, open <a href="http://localhost:8080/" target="_blank" rel="noopener noreferrer">http://localhost:8080</a> in a browser
to view the MLflow service and its training information. The page looks like this:</p>
<p><img decoding="async" loading="lazy" alt="MLflow Landing Page" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-fresh-page-153d828ad7f4908b9b81ef98ab7fb086.png" width="1466" height="820" class="img_e6Vo"></p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="mlflow简单使用样例">A Simple MLflow Usage Example<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#mlflow%E7%AE%80%E5%8D%95%E4%BD%BF%E7%94%A8%E6%A0%B7%E4%BE%8B" class="hash-link" aria-label="Direct link to MLflow简单使用样例" title="Direct link to MLflow简单使用样例">​</a></h2>
<p>We use a simple code example to show how to save training parameters and results from Python.</p>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="创建experiment">Creating an Experiment<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E5%88%9B%E5%BB%BAexperiment" class="hash-link" aria-label="Direct link to 创建Experiment" title="Direct link to 创建Experiment">​</a></h3>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> random  </span><span class="token comment" style="color:#999988;font-style:italic"># used by the metric examples later in this post</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> mlflow</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">MLFLOW_TRACKING_URI </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"http://localhost:8080"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_tracking_uri</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">MLFLOW_TRACKING_URI</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_experiment</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"experiment_1"</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" 
class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
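<p>In the code above, we import the <code>mlflow</code> module and point it at the address of the local MLflow server we set up earlier.
From then on, our data is sent to the server at that address.</p>
<p>We also set the active Experiment to <code>experiment_1</code>, which MLflow creates if it does not already exist.
All subsequent Runs are saved under this Experiment.</p>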
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="设置run的参数">Setting Run Parameters<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E8%AE%BE%E7%BD%AErun%E7%9A%84%E5%8F%82%E6%95%B0" class="hash-link" aria-label="Direct link to 设置Run的参数" title="Direct link to 设置Run的参数">​</a></h3>
<p>For a typical training application, we need to record the parameters associated with a Run (for example, hyperparameters).</p>
<p>The following code creates a Run and logs its parameters:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">with</span><span class="token plain"> mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">start_run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">run_name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'Run Name #1'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Log with mlflow.log_param</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_param</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"param1"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.1283</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_param</span><span 
class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"param2"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.238292</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Log with mlflow.log_params</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_params</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"param3"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.392</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"param4"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.4829</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>In the code above, we use:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">with</span><span class="token plain"> mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">start_run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">run_name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'Run Name #1'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
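<p>to create a new Run and give it the Run Name <code>Run Name #1</code>.</p>
<p>We then log parameters in two different ways:</p>
<ul>
<li><code>mlflow.log_param</code>: logs a single parameter as a key-value pair.</li>
<li><code>mlflow.log_params</code>: logs a batch of parameters supplied as a dictionary.</li>
</ul>
<p>The parameters are recorded in MLflow and appear in the web UI as follows:</p>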
<p><img decoding="async" loading="lazy" alt="Run Parameters" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-run-parameters-798afba2c34be0d86f41a507a1c0b384.png" width="1452" height="508" class="img_e6Vo"></p>
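<p>Search configurations, such as those produced by a tuning library, are often nested, while <code>mlflow.log_params</code> is easiest to read in the UI when it receives a flat dictionary. The helper below is a hypothetical sketch (the name <code>flatten_params</code> and the sample config are assumptions) that joins nested keys with dots before logging.</p>

```python
def flatten_params(config, parent_key="", sep="."):
    """Flatten a nested dict into a single-level dict with joined keys."""
    flat = {}
    for key, value in config.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the key prefix along.
            flat.update(flatten_params(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical nested configuration.
config = {"lr": 0.001, "model": {"hidden": 256, "activation": "relu"}}
flat = flatten_params(config)
print(flat)
# Inside a run, the flat dict could then be passed to mlflow.log_params(flat).
```

<p>Each flattened key then shows up as its own parameter row, rather than one stringified dictionary.</p>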
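<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="记录模型指标">Logging Model Metrics<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E8%AE%B0%E5%BD%95%E6%A8%A1%E5%9E%8B%E6%8C%87%E6%A0%87" class="hash-link" aria-label="Direct link to 记录模型指标" title="Direct link to 记录模型指标">​</a></h3>
<p>When training finishes, and also while it is in progress, we can log model metrics to track the training process and compare model performance.</p>
<p>As with parameters, there are two methods for logging metrics. The lines below continue inside the <code>with mlflow.start_run(...)</code> block and use Python's standard <code>random</code> module, which must be imported first:</p>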
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metric</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"metric_once"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">100</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metrics</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"metric_1"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1.1019</span><span class="token punctuation" 
style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"metric_2"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2.3829</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"metric_3"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.9842</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
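<p>That is, we use the following two methods:</p>
<ul>
<li><code>mlflow.log_metric</code>: logs a single metric, again as a key-value pair.</li>
<li><code>mlflow.log_metrics</code>: logs a group of metrics supplied as a dictionary.</li>
</ul>
<p>In addition, during training we may want to log a metric repeatedly to track how it changes.
The following code sketches such a training loop:</p>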
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> i </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metric</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"metric_update"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> i </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        
mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metric</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"metric_step"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">30</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> i </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> step</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">i</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metrics</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"metric_step_3"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">100</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> i </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"metric_step_4"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">200</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> i </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> step</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">i</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
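<p>The code above uses a loop to simulate logging metrics during training.
Specifically:</p>
<ul>
<li><code>metric_update</code>: logged on every iteration of the loop. In other words, we can call <code>log_metric</code> repeatedly to record how a metric changes over time.</li>
<li><code>metric_step</code>: also logged on every iteration, but with the <code>step</code> argument set, so the metric can be tracked against the step number.</li>
<li><code>metric_step_3/4</code>: these metrics show how to track several metrics at once by passing a dictionary together with the <code>step</code> argument.</li>
</ul>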
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="前端指标看板">Metrics Dashboard in the UI<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E5%89%8D%E7%AB%AF%E6%8C%87%E6%A0%87%E7%9C%8B%E6%9D%BF" class="hash-link" aria-label="Direct link to Metrics Dashboard in the UI" title="Direct link to Metrics Dashboard in the UI">​</a></h3>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="指标汇总">Metrics Summary<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E6%8C%87%E6%A0%87%E6%B1%87%E6%80%BB" class="hash-link" aria-label="Direct link to Metrics Summary" title="Direct link to Metrics Summary">​</a></h4>
<p>On the Overview tab of a Run, we can see the latest value of each metric, as shown below:</p>
<p><img decoding="async" loading="lazy" alt="MLflow Run - Overview - Metrics" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-run-overview-metrics-7646d64895909c98aa5f3b0ab574152a.png" width="1442" height="704" class="img_e6Vo"></p>
<p>Clicking a metric opens its detail view.</p>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="只有一次记录的指标">Metrics Logged Only Once<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E5%8F%AA%E6%9C%89%E4%B8%80%E6%AC%A1%E8%AE%B0%E5%BD%95%E7%9A%84%E6%8C%87%E6%A0%87" class="hash-link" aria-label="Direct link to Metrics Logged Only Once" title="Direct link to Metrics Logged Only Once">​</a></h4>
<p>For a metric that was logged only once, the UI looks like this:</p>
<p><img decoding="async" loading="lazy" alt="MLflow Metrics Once" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-metrics-once-95ae47f55d7bd56480df0e1269a21203.png" width="2896" height="1606" class="img_e6Vo"></p>
<p>Because the metric has only one recorded value, its latest, minimum, and maximum values are all the same number.</p>
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="含有步数的指标">Metrics with Steps<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E5%90%AB%E6%9C%89%E6%AD%A5%E6%95%B0%E7%9A%84%E6%8C%87%E6%A0%87" class="hash-link" aria-label="Direct link to Metrics with Steps" title="Direct link to Metrics with Steps">​</a></h4>
<p>For metrics logged with a step, setting <code>X-axis</code> to <code>Step</code> shows how the metric changes with the step number,
as shown below:</p>
<p><img decoding="async" loading="lazy" alt="MLflow Metrics - With Steps" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-metrics-with-steps-681be0461fb734fde3a4123ffc218c6c.png" width="2896" height="1716" class="img_e6Vo"></p>
<p>Note also that the metric's latest, minimum, and maximum values now differ; together they summarize the metric's statistics.</p>
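<p>To make the relationship between a metric's history and these three summary values concrete, here is a minimal pure-Python sketch. It is only an illustration and not MLflow's actual implementation: <code>MetricPoint</code> and <code>summarize</code> are hypothetical names, and the history is assumed to be stored in logging order.</p>

```python
from typing import NamedTuple


class MetricPoint(NamedTuple):
    """One logged value of a metric (hypothetical helper type)."""
    step: int
    value: float


def summarize(history: list[MetricPoint]) -> dict[str, float]:
    """Return the latest, minimum, and maximum value of a metric history.

    `history` is assumed to be in logging order, so the last entry is the latest.
    """
    if not history:
        raise ValueError("history must contain at least one point")
    values = [point.value for point in history]
    return {"latest": history[-1].value, "min": min(values), "max": max(values)}


# A metric logged once: all three statistics collapse to the same number.
once = [MetricPoint(step=0, value=42.0)]
print(summarize(once))

# A metric logged at several steps: the statistics generally differ.
stepped = [MetricPoint(step=i, value=30.0 * i) for i in range(10)]
print(summarize(stepped))
```

<p>With a single point the three values necessarily coincide, which matches what the UI shows for a metric that was logged only once.</p>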
<h4 class="anchor anchorWithStickyNavbar_n_Cs" id="模型指标看板">Model Metrics Dashboard<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E6%A8%A1%E5%9E%8B%E6%8C%87%E6%A0%87%E7%9C%8B%E6%9D%BF" class="hash-link" aria-label="Direct link to Model Metrics Dashboard" title="Direct link to Model Metrics Dashboard">​</a></h4>
<p>The <code>Model metrics</code> tab of a Run shows the status of all metrics,
as shown below:</p>
<p><img decoding="async" loading="lazy" alt="MLflow Metrics - Tab" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-metrics-tab-093a2416abcb67021e12576e431cf5df.png" width="2892" height="1812" class="img_e6Vo"></p>
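<p>We can also use the filter to quickly narrow the view down to the metrics we care about.
The figure below filters for metrics whose names start with <code>metric_step</code>:</p>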
<p><img decoding="async" loading="lazy" alt="MLflow Metrics - Filter" src="https://eason-projects.github.io/eason-blog/assets/images/mlflow-metrics-filter-bb6b6356ee35bce5ee5a8cfd81a1f714.png" width="2896" height="1318" class="img_e6Vo"></p>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="完整示例代码">Complete Example Code<a href="https://eason-projects.github.io/eason-blog/blog/mlflow-intro#%E5%AE%8C%E6%95%B4%E7%A4%BA%E4%BE%8B%E4%BB%A3%E7%A0%81" class="hash-link" aria-label="Direct link to Complete Example Code" title="Direct link to Complete Example Code">​</a></h3>
<p>The complete Python code for everything above is as follows:</p>
<div class="language-python codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-python codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> mlflow</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">import</span><span class="token plain"> random</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">MLFLOW_TRACKING_URI </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"http://localhost:8080"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_tracking_uri</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">MLFLOW_TRACKING_URI</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">set_experiment</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"experiment_1"</span><span 
class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">with</span><span class="token plain"> mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">start_run</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">run_name</span><span class="token operator" style="color:#393A34">=</span><span class="token string" style="color:#e3116c">'Run Name #1'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Log with mlflow.log_param</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_param</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"param1"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.1283</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_param</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
string" style="color:#e3116c">"param2"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.238292</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic"># Log with mlflow.log_params</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_params</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"param3"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.392</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"param4"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.4829</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain">    </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" 
style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metric</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"metric_once"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">100</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metrics</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"metric_1"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1.1019</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"metric_2"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2.3829</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"metric_3"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3.9842</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> i </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> </span><span class="token builtin">range</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token 
plain">log_metric</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"metric_update"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> i </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metric</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"metric_step"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">30</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> i </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" 
style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> step</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">i</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        mlflow</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">log_metrics</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"metric_step_3"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">100</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> i </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span 
class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token string" style="color:#e3116c">"metric_step_4"</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">200</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> i </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">random</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> step</span><span class="token operator" style="color:#393A34">=</span><span class="token plain">i</span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" 
d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>]]></content:encoded>
            <category>Machine Learning</category>
            <category>MLflow</category>
        </item>
        <item>
            <title><![CDATA[Ray Tune]]></title>
            <link>https://eason-projects.github.io/eason-blog/blog/ray-tune</link>
            <guid>https://eason-projects.github.io/eason-blog/blog/ray-tune</guid>
            <pubDate>Mon, 03 Feb 2025 00:00:00 GMT</pubDate>
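            <description><![CDATA[When using deep-learning-based reinforcement learning training frameworks such as Stable-Baselines3, we inevitably need to run a large number of experiments and hyperparameter searches.]]></description>
            <content:encoded><![CDATA[<p>When using deep-learning-based reinforcement learning training frameworks such as Stable-Baselines3,
we inevitably need to run a large number of experiments and hyperparameter searches.</p>
<p>A framework such as Ray Tune can carry out this parameter search for us.</p>
<p>This article describes how to use Ray Tune, together with its surrounding framework and components, to search hyperparameters quickly.</p>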
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="安装">Installation<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#%E5%AE%89%E8%A3%85" class="hash-link" aria-label="Direct link to Installation" title="Direct link to Installation">​</a></h2>
<p>Because we want to use tools such as the Ray Dashboard,
we need to install some extra dependencies (without them, those features will not work properly):</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">pip install "ray[tune,default]"</span><br></span></code></pre></div></div>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="ray-clusterray集群">Ray Cluster<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#ray-clusterray%E9%9B%86%E7%BE%A4" class="hash-link" aria-label="Direct link to Ray Cluster" title="Direct link to Ray Cluster">​</a></h2>
<p>Ray officially recommends running on a cluster service, such as one offered by a cloud vendor. For demonstration purposes, however,
we build a local cluster with the following command:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">RAY_GRAFANA_HOST=http://localhost:3090 \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">RAY_PROMETHEUS_HOST=http://localhost:9090 \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">RAY_PROMETHEUS_NAME=Prometheus \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">RAY_GRAFANA_IFRAME_HOST=http://localhost:3090 \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">ray start --head --port=6379 --dashboard-host=0.0.0.0 --dashboard-port=8000</span><br></span></code></pre></div></div>
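<p>The command above starts a head node (the primary node of the cluster).
It also starts the Ray Dashboard on port 8000 and, because <code>--dashboard-host=0.0.0.0</code> binds to all network interfaces, exposes the Dashboard beyond the local machine
(this is for demonstration only; configure it with care in production).</p>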
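<p>The environment variables in front of <code>ray start</code> configure how metrics are embedded in the Dashboard; the metrics are only displayed correctly after the related sections later in this article have been completed.</p>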
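<p>If the head node is launched from a script rather than typed by hand, the same settings can be kept in one place. The following is a minimal sketch under that assumption: <code>METRICS_ENV</code>, <code>RAY_START_CMD</code>, and <code>build_launch</code> are hypothetical names, while the variable values and flags are copied from the shell command above. The sketch only builds the environment and the argument list; it does not start anything.</p>

```python
import os
import shlex

# Values copied from the shell command above; adjust them for your own setup.
METRICS_ENV = {
    "RAY_GRAFANA_HOST": "http://localhost:3090",
    "RAY_PROMETHEUS_HOST": "http://localhost:9090",
    "RAY_PROMETHEUS_NAME": "Prometheus",
    "RAY_GRAFANA_IFRAME_HOST": "http://localhost:3090",
}

RAY_START_CMD = [
    "ray", "start", "--head",
    "--port=6379",
    "--dashboard-host=0.0.0.0",
    "--dashboard-port=8000",
]


def build_launch(base_env=None):
    """Return (env, cmd) for launching the head node; nothing is executed here."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(METRICS_ENV)
    return env, list(RAY_START_CMD)


env, cmd = build_launch(base_env={})
print(shlex.join(cmd))
# To actually start the head node, one could run:
#   subprocess.run(cmd, env=build_launch()[0], check=True)
```

<p>Keeping the environment separate from the command makes it easy to change the Grafana or Prometheus address without touching the <code>ray start</code> flags.</p>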
<p>Once the cluster has started successfully, we see the following output:</p>
<div class="codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-text codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">Local node IP: 127.0.0.1</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">--------------------</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Ray runtime started.</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">--------------------</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Next steps</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  To connect to this Ray cluster:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    import ray</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    ray.init()</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  To submit a Ray job using the Ray Jobs CLI:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    RAY_ADDRESS='http://127.0.0.1:8000' ray job submit --working-dir . 
-- python my_script.py</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  for more information on submitting Ray jobs to the Ray cluster.</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  To terminate the Ray runtime, run</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    ray stop</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  To view the status of the cluster, use</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    ray status</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  To monitor and debug Ray, view the dashboard at </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    127.0.0.1:8000</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  If connection to the dashboard fails, check your firewall settings and network configuration.</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path 
fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The output above includes the Dashboard address, which shows that our Ray Dashboard is configured correctly.
We can open it in a browser at <a href="http://localhost:8000/" target="_blank" rel="noopener noreferrer">http://localhost:8000</a>.</p>
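<p>Before opening the browser, it can be handy to confirm that something is actually listening on the Dashboard port. The sketch below uses only the Python standard library; <code>is_port_open</code> is a hypothetical helper, and the check only proves that a TCP connection succeeds, not that the service behind it is the Ray Dashboard.</p>

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8000 is the --dashboard-port used in this article.
    print("dashboard reachable:", is_port_open("127.0.0.1", 8000))
```

<p>If this prints <code>False</code> right after <code>ray start</code>, the firewall and network hints at the end of the startup output are a reasonable place to begin.</p>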
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="集群状态检查">Checking Cluster Status<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#%E9%9B%86%E7%BE%A4%E7%8A%B6%E6%80%81%E6%A3%80%E6%9F%A5" class="hash-link" aria-label="Direct link to Checking Cluster Status" title="Direct link to Checking Cluster Status">​</a></h3>
<p>Running the <code>ray status</code> command shows the current state of the cluster.
Its output looks like this:</p>
<div class="codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-text codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">======== Autoscaler status: 2025-02-03 04:21:48.303038 ========</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Node status</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">---------------------------------------------------------------</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Active:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> 1 node_e3eda3eac5760c03fe761332c030a854809df3ae366ba8e3e852573f</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Pending:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> (no pending nodes)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Recent failures:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> (no failures)</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Resources</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">---------------------------------------------------------------</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Usage:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> 0.0/12.0 CPU</span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain"> 0B/11.89GiB memory</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> 0B/2.00GiB object_store_memory</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Demands:</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"> (no resource demands)</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The output shows that the cluster currently has one active node (<code>node_e3eda3eac5760c03fe761332c030a854809df3ae366ba8e3e852573f</code>).</p>
<p>Because no tasks have been started yet, resource usage is zero.</p>
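<p>To read these numbers from a script rather than by eye, the <code>Usage</code> block can be parsed with a few lines of Python. This is only a minimal sketch: it assumes the text layout shown in the output above (which may differ between Ray versions), and <code>parse_usage</code> and <code>SAMPLE_STATUS</code> are hypothetical names introduced for illustration.</p>

```python
import re

# Sample text copied from the `ray status` output shown above.
SAMPLE_STATUS = """\
Usage:
 0.0/12.0 CPU
 0B/11.89GiB memory
 0B/2.00GiB object_store_memory

Demands:
 (no resource demands)
"""

# One usage line looks like " 0.0/12.0 CPU": used, total, then the resource name.
USAGE_LINE = re.compile(r"^\s*(\S+)/(\S+)\s+(\S+)\s*$")


def parse_usage(status_text):
    """Return {resource: (used, total)} from the Usage block; values stay strings."""
    usage = {}
    in_usage = False
    for line in status_text.splitlines():
        if line.strip() == "Usage:":
            in_usage = True
            continue
        if in_usage:
            match = USAGE_LINE.match(line)
            if match is None:
                break  # a blank line or the next section ends the Usage block
            used, total, name = match.groups()
            usage[name] = (used, total)
    return usage


if __name__ == "__main__":
    for name, (used, total) in parse_usage(SAMPLE_STATUS).items():
        print(f"{name}: {used} of {total} in use")
```

<p>The values are kept as strings on purpose, since the units (<code>B</code>, <code>GiB</code>) vary from line to line.</p>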
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="集群停止">Stopping the Cluster<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#%E9%9B%86%E7%BE%A4%E5%81%9C%E6%AD%A2" class="hash-link" aria-label="Direct link to 集群停止" title="Direct link to 集群停止">​</a></h3>
<p>Running <code>ray stop</code> shuts the cluster down.</p>
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="ray-dashboard">Ray Dashboard<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#ray-dashboard" class="hash-link" aria-label="Direct link to Ray Dashboard" title="Direct link to Ray Dashboard">​</a></h2>
<p>By default, the Dashboard cannot display time-series metrics.
To view them, we need to install Grafana and Prometheus.</p>
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>What are Grafana and Prometheus?</div><div class="admonitionContent_y0EO"><ul>
<li><strong>Grafana</strong> is a popular metrics visualization system (primarily a front end). It connects quickly to many kinds of data sources and provides dashboards, alerting, and other monitoring features.</li>
<li><strong>Prometheus</strong> handles metric collection and querying: it scrapes metrics from a variety of systems and exposes a query service for them.</li>
</ul></div></div>
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="安装prometheus">Installing Prometheus<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#%E5%AE%89%E8%A3%85prometheus" class="hash-link" aria-label="Direct link to 安装Prometheus" title="Direct link to 安装Prometheus">​</a></h3>
<p>Running the following command lets Ray install the Prometheus component for us:</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">ray metrics launch-prometheus</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Once Prometheus is installed and running, we see output like the following:</p>
<div class="codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-text codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">Downloaded: 105.84 MB / 105.84 MB</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Download completed.</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">2025-02-03 04:32:09,998 - INFO - Prometheus installed successfully.</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">2025-02-03 04:32:10,004 - INFO - Prometheus has started.</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">Prometheus is running with PID 91623.</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">To stop Prometheus, use the command: `ray metrics shutdown-prometheus`, 'kill 91623', or if you need to force stop, use 'kill -9 91623'.</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">To list all processes running Prometheus, use the command: 'ps aux | grep prometheus'.</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
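<p>As the output above suggests, we can also run <code>ps aux | grep prometheus</code> to list the Prometheus processes and, if needed, terminate them.</p>
<p>With Prometheus installed and running, open <a href="http://localhost:9090/" target="_blank" rel="noopener noreferrer">http://localhost:9090</a> in a browser
to reach the Prometheus web UI.</p>
<p>To check that metrics are being collected, enter and run a query for a metric such as <code>ray_node_cpu_utilization</code>.</p>
<p>If everything is working, the result looks similar to this:</p>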
<p><img decoding="async" loading="lazy" alt="Prometheus Metrics Testing" src="https://eason-projects.github.io/eason-blog/assets/images/ray-dashboard-prometheus-metrics-testing-52cc72b465a00134830573c40608d7c2.png" width="3016" height="1524" class="img_e6Vo"></p>
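<p>The same check can be scripted against Prometheus' HTTP API, which serves instant queries at <code>/api/v1/query</code>. The sketch below uses only the Python standard library and assumes Prometheus is listening on <code>http://localhost:9090</code> as described above; the helper names (<code>build_query_url</code>, <code>extract_samples</code>, <code>query_prometheus</code>) are hypothetical and exist only for this example.</p>

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # address used in this post


def build_query_url(base_url, promql):
    """Build the URL for Prometheus' instant-query endpoint (/api/v1/query)."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def extract_samples(payload):
    """Return (labels, value) pairs from an instant-query response of type 'vector'."""
    if payload.get("status") != "success":
        raise ValueError("query failed: " + str(payload.get("error")))
    # Each result keeps its labels under "metric" and [timestamp, "value"] under "value".
    return [(item["metric"], float(item["value"][1])) for item in payload["data"]["result"]]


def query_prometheus(promql, base_url=PROMETHEUS_URL, timeout=5):
    """Run one instant query against a live Prometheus server."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return extract_samples(json.load(resp))


# Example (needs a running Prometheus):
#   for labels, value in query_prometheus("ray_node_cpu_utilization"):
#       print(labels, value)
```

<p>An empty result list usually means Prometheus is up but has not scraped any Ray metrics yet.</p>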
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="安装grafana">Installing Grafana<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#%E5%AE%89%E8%A3%85grafana" class="hash-link" aria-label="Direct link to 安装Grafana" title="Direct link to 安装Grafana">​</a></h3>
<p>There are several ways to install Grafana: we can use the official installers that Grafana provides for each operating system, or deploy the service quickly with Docker.
Here we deploy Grafana with Docker.</p>
<div class="language-shell codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-shell codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">docker run -d \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  --name=grafana \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  -p 3090:3000 \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  -v grafana-storage:/var/lib/grafana \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  -e "GF_AUTH_ANONYMOUS_ENABLED=true" \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  -e "GF_SECURITY_ALLOW_EMBEDDING=true" \</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  grafana/grafana-oss</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The command above quickly sets up Grafana and runs it in the background, reachable on port <code>3090</code> of the host.</p>
<p>Open <a href="http://localhost:3090/" target="_blank" rel="noopener noreferrer">http://localhost:3090</a> in a browser to install and configure Grafana.</p>
<div class="theme-admonition theme-admonition-tip admonition_u5eB alert alert--success"><div class="admonitionHeading_grP_"><span class="admonitionIcon_Td3r"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>Default Grafana credentials</div><div class="admonitionContent_y0EO"><p>Grafana's default username and password are both <code>admin</code>. You can change the password right after logging in to keep the system secure.</p></div></div>
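<p>Before moving on, it can help to confirm from a script that the container is actually serving requests. Grafana exposes a health endpoint at <code>/api/health</code>; the sketch below assumes the <code>3090</code> port mapping used above, and the helper names (<code>health_url</code>, <code>is_healthy</code>, <code>check_grafana</code>) are hypothetical.</p>

```python
import json
import urllib.request

GRAFANA_URL = "http://localhost:3090"  # host port mapped by the docker run command above


def health_url(base_url):
    """Return the URL of Grafana's health endpoint."""
    return base_url.rstrip("/") + "/api/health"


def is_healthy(health_payload):
    """Treat Grafana as ready when the health payload reports "database": "ok"."""
    return health_payload.get("database") == "ok"


def check_grafana(base_url=GRAFANA_URL, timeout=5):
    """Fetch the health endpoint of a running Grafana and report whether it is ready."""
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return is_healthy(json.load(resp))


# Example (needs the Grafana container to be running):
#   print(check_grafana())
```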
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="给grafana添加数据源">Adding a Data Source to Grafana<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#%E7%BB%99grafana%E6%B7%BB%E5%8A%A0%E6%95%B0%E6%8D%AE%E6%BA%90" class="hash-link" aria-label="Direct link to 给Grafana添加数据源" title="Direct link to 给Grafana添加数据源">​</a></h3>
<p>After logging in, go to Connections -&gt; Data sources and add the address of the Prometheus server.
Because Grafana runs in Docker, the container needs an address like the following to reach the Prometheus service on the host:</p>
<div class="codeBlockContainer_pFCi theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_eaSu"><pre tabindex="0" class="prism-code language-text codeBlock_jhes thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_JxES"><span class="token-line" style="color:#393A34"><span class="token plain">http://host.docker.internal:9090</span><br></span></code></pre><div class="buttonGroup_fFgC"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_SvL5" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_X9sp"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_tmgA"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Adjust the address to match your actual environment.</p>
<p>For the data source name, use the same value as <code>RAY_PROMETHEUS_NAME</code>, which defaults to <code>Prometheus</code>.</p>
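<p>For reference, the same settings can be written down as the JSON body that Grafana's data source API (<code>POST /api/datasources</code>) accepts. This is a sketch under assumptions rather than a required step: the UI flow above is sufficient, <code>prometheus_datasource_payload</code> is a hypothetical helper, and the <code>access</code> and <code>isDefault</code> values are choices made for this example.</p>

```python
import json
import os


def prometheus_datasource_payload(url="http://host.docker.internal:9090"):
    """Build the JSON body for Grafana's "create data source" API (POST /api/datasources)."""
    return {
        # Must match the name Ray looks for; RAY_PROMETHEUS_NAME defaults to "Prometheus".
        "name": os.environ.get("RAY_PROMETHEUS_NAME", "Prometheus"),
        "type": "prometheus",
        "url": url,
        # "proxy": Grafana's backend (inside the container) performs the queries,
        # which is why the URL must be reachable from the container, not the browser.
        "access": "proxy",
        "isDefault": True,
    }


if __name__ == "__main__":
    print(json.dumps(prometheus_datasource_payload(), indent=2))
```

<p>Reading the name from <code>RAY_PROMETHEUS_NAME</code> keeps the Grafana side and the Ray side in sync if the default is ever overridden.</p>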
<h3 class="anchor anchorWithStickyNavbar_n_Cs" id="给grafana添加dashboard">Adding Dashboards to Grafana<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#%E7%BB%99grafana%E6%B7%BB%E5%8A%A0dashboard" class="hash-link" aria-label="Direct link to 给Grafana添加Dashboard" title="Direct link to 给Grafana添加Dashboard">​</a></h3>
<p>With the data source in place, we can copy and import the 4 JSON files under <code>/tmp/ray/session_latest/metrics/grafana/dashboards</code> to quickly create the Ray Cluster dashboards.</p>
<p>After a successful import, we will see 4 dashboards:</p>
<ul>
<li>Data Dashboard</li>
<li>Default Dashboard</li>
<li>Serve Dashboard</li>
<li>Serve Deployment Dashboard</li>
</ul>
<p>For example, the Default Dashboard looks like this:</p>
<p><img decoding="async" loading="lazy" alt="Grafana Dashboard" src="https://eason-projects.github.io/eason-blog/assets/images/grafna-dashboard-0bb1f6fb746eb38b46471a88b931c504.png" width="3018" height="1524" class="img_e6Vo"></p>
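<p>If clicking through the import dialog four times gets tedious, the JSON files can be wrapped into request bodies for Grafana's dashboard API (<code>POST /api/dashboards/db</code>). The following is a minimal sketch that only prepares the bodies and does not send them; <code>build_import_payloads</code> is a hypothetical helper, and resetting <code>id</code> and setting <code>overwrite</code> are assumptions made for this example.</p>

```python
import json
from pathlib import Path

# Directory named in this post; Ray writes its generated dashboards here.
DASHBOARD_DIR = Path("/tmp/ray/session_latest/metrics/grafana/dashboards")


def build_import_payloads(dashboard_dir):
    """Return {file name: request body} for every *.json dashboard in the directory."""
    payloads = {}
    for path in sorted(Path(dashboard_dir).glob("*.json")):
        dashboard = json.loads(path.read_text(encoding="utf-8"))
        dashboard["id"] = None  # let Grafana assign a fresh internal id on import
        payloads[path.name] = {"dashboard": dashboard, "overwrite": True}
    return payloads


if __name__ == "__main__":
    # Prints one line per dashboard file found; nothing if the directory is missing.
    for name, body in build_import_payloads(DASHBOARD_DIR).items():
        print(name, ":", body["dashboard"].get("title"))
```

<p>Each body can then be sent with an authenticated POST request, or simply inspected to confirm which dashboards Ray generated.</p>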
<h2 class="anchor anchorWithStickyNavbar_n_Cs" id="参考资料">References<a href="https://eason-projects.github.io/eason-blog/blog/ray-tune#%E5%8F%82%E8%80%83%E8%B5%84%E6%96%99" class="hash-link" aria-label="Direct link to 参考资料" title="Direct link to 参考资料">​</a></h2>
<ul>
<li><a href="https://docs.ray.io/en/latest/cluster/metrics.html" target="_blank" rel="noopener noreferrer">Collecting and monitoring metrics</a> (the official Ray documentation on setting up Prometheus and Grafana)</li>
</ul>]]></content:encoded>
            <category>Reinforcement Learning</category>
        </item>
    </channel>
</rss>