普通视图

发现新文章,点击刷新页面。
昨天以前IPhysResearch

ICA

2001年1月1日 08:00
  • 技术神秘化的去魅:Sora关键技术逆向工程图解:https://zhuanlan.zhihu.com/p/687928845
    • 这里以通俗易懂的方式来分析Sora的可能做法,包括它的整体结构以及关键组件。我希望即使您不太懂技术,也能大致看明白Sora的可能做法,所以画了几十张图来让看似复杂的机制更好理解,如果您看完对某部分仍不理解,那是我的问题。
  • 【Lite-Sora是一个旨在复制 Sora 的开源项目,致力于通过简洁易懂的代码,探索和提高视频生成算法的基础框架】‘Lite-Sora - An initiative to replicate Sora’ GitHub: https://github.com/modelscope/lite-sora

https://iphysresearch.github.io/blog/post/math/

2001年1月1日 08:00

https://textbooks.math.gatech.edu/ila/

【免费书稿《现代统计学入门(第二版)》】《Introduction to Modern Statistics (2nd Ed)》https://openintro-ims2.netlify.app/

【线性代数的艺术可视化图释中文版】‘The-Art-of-Linear-Algebra - Graphic notes on Gilbert Strang’s “Linear Algebra for Everyone”’ Kefang Liu GitHub: github.com/kf-liu/The-Art-of-Linear-Algebra-zh-CN

面向开发者的 ChatGPT 提示工程

2023年5月14日 08:00

这是关于 DeepLearning.ai 联合 OpenAI 推出《面向开发者的 ChatGPT 提示工程》教程相关的资料整理,讲师为 DeepLearning 创始人吴恩达以及 OpenAI 开发者 Isa Fulford。 视频字幕文件已开源至 GitHub,这个帖子几乎完全参考了 Datawhale 的 GitHub 上的整理,以方便我个人学习和资料查找。


项目简介

吴恩达《ChatGPT Prompt Engineering for Developers》课程中文版,主要内容为指导开发者如何构建 Prompt 并基于 OpenAI API 构建新的、基于 LLM 的应用,包括:

  • 书写 Prompt 的原则;
  • 文本总结(如总结用户评论);
  • 文本推断(如情感分类、主题提取);
  • 文本转换(如翻译、自动纠错);
  • 扩展(如书写邮件);

英文原版地址:ChatGPT Prompt Engineering for Developers

中文字幕视频地址:吴恩达 x OpenAI的Prompt Engineering课程专业翻译版

中英双语字幕下载:《ChatGPT提示工程》非官方版中英双语字幕

项目意义

LLM 正在逐步改变人们的生活,而对于开发者,如何基于 LLM 提供的 API 快速、便捷地开发一些具备更强能力、集成LLM 的应用,来便捷地实现一些更新颖、更实用的能力,是一个急需学习的重要能力。由吴恩达老师与 OpenAI 合作推出的 《ChatGPT Prompt Engineering for Developers》教程面向入门 LLM 的开发者,深入浅出地介绍了对于开发者,如何构造 Prompt 并基于 OpenAI 提供的 API 实现包括总结、推断、转换等多种常用功能,是入门 LLM 开发的经典教程。因此,我们将该课程翻译为中文,并复现其范例代码,也为原视频增加了中文字幕,支持国内中文学习者直接使用,以帮助中文学习者更好地学习 LLM 开发。

项目受众

适用于所有具备基础 Python 能力,想要入门 LLM 的开发者。

项目亮点

《ChatGPT Prompt Engineering for Developers》作为由吴恩达老师与 OpenAI 联合推出的官方教程,在可预见的未来会成为 LLM 的重要入门教程,但是目前还只支持英文版且国内访问受限,打造中文版且国内流畅访问的教程具有重要意义。

内容大纲

目录:

  1. 简介 Introduction @邹雨衡
  2. Prompt 的构建原则 Guidelines @邹雨衡
  3. 如何迭代优化 Prompt Itrative @邹雨衡
  4. 文本总结 Summarizing @玉琳
  5. 文本推断 Inferring @长琴
  6. 文本转换 Transforming @玉琳
  7. 文本扩展 Expanding @邹雨衡
  8. 聊天机器人 Chatbot @长琴
  9. 总结 @长琴

中文字幕视频:吴恩达 x OpenAI的Prompt Engineering课程专业翻译版 @万礼行

致谢

核心贡献者


1. 简介

作者 吴恩达教授

欢迎来到本课程,我们将为开发人员介绍 ChatGPT 提示工程。本课程由 Isa Fulford 教授和我一起授课。Isa Fulford 是 OpenAI 的技术团队成员,曾开发过受欢迎的 ChatGPT 检索插件,并且在教授人们如何在产品中使用 LLM 或 LLM 技术方面做出了很大贡献。她还参与编写了教授人们使用 Prompt 的 OpenAI cookbook。

互联网上有很多有关提示的材料,例如《30 prompts everyone has to know》之类的文章。这些文章主要集中在 ChatGPT Web 用户界面上,许多人在使用它执行特定的、通常是一次性的任务。但是,我认为 LLM 或大型语言模型作为开发人员的更强大功能是使用 API 调用到 LLM,以快速构建软件应用程序。我认为这方面还没有得到充分的重视。实际上,我们在 DeepLearning.AI 的姊妹公司 AI Fund 的团队一直在与许多初创公司合作,将这些技术应用于许多不同的应用程序上。看到 LLM API 能够让开发人员非常快速地构建应用程序,这真是令人兴奋。

在本课程中,我们将与您分享一些可能性以及如何实现它们的最佳实践。

随着大型语言模型(LLM)的发展,LLM 大致可以分为两种类型,即基础LLM和指令微调LLM。基础LLM是基于文本训练数据,训练出预测下一个单词能力的模型,其通常是在互联网和其他来源的大量数据上训练的。例如,如果你以“从前有一只独角兽”作为提示,基础LLM可能会继续预测“生活在一个与所有独角兽朋友的神奇森林中”。但是,如果你以“法国的首都是什么”为提示,则基础LLM可能会根据互联网上的文章,将答案预测为“法国最大的城市是什么?法国的人口是多少?”,因为互联网上的文章很可能是有关法国国家的问答题目列表。

许多 LLMs 的研究和实践的动力正在指令调整的 LLMs 上。指令调整的 LLMs 已经被训练来遵循指令。因此,如果你问它,“法国的首都是什么?”,它更有可能输出“法国的首都是巴黎”。指令调整的 LLMs 的训练通常是从已经训练好的基本 LLMs 开始,该模型已经在大量文本数据上进行了训练。然后,使用输入是指令、输出是其应该返回的结果的数据集来对其进行微调,要求它遵循这些指令。然后通常使用一种称为 RLHF(reinforcement learning from human feedback,人类反馈强化学习)的技术进行进一步改进,使系统更能够有帮助地遵循指令。

因为指令调整的 LLMs 已经被训练成有益、诚实和无害的,所以与基础LLMs相比,它们更不可能输出有问题的文本,如有害输出。许多实际使用场景已经转向指令调整的LLMs。您在互联网上找到的一些最佳实践可能更适用于基础LLMs,但对于今天的大多数实际应用,我们建议将注意力集中在指令调整的LLMs上,这些LLMs更容易使用,而且由于OpenAI和其他LLM公司的工作,它们变得更加安全和更加协调。

因此,本课程将重点介绍针对指令调整 LLM 的最佳实践,这是我们建议您用于大多数应用程序的。在继续之前,我想感谢 OpenAI 和 DeepLearning.ai 团队为 Izzy 和我所提供的材料作出的贡献。我非常感激 OpenAI 的 Andrew Main、Joe Palermo、Boris Power、Ted Sanders 和 Lillian Weng,他们参与了我们的头脑风暴材料的制定和审核,为这个短期课程编制了课程大纲。我也感激 Deep Learning 方面的 Geoff Ladwig、Eddy Shyu 和 Tommy Nelson 的工作。

当您使用指令调整 LLM 时,请类似于考虑向另一个人提供指令,假设它是一个聪明但不知道您任务的具体细节的人。当 LLM 无法正常工作时,有时是因为指令不够清晰。例如,如果您说“请为我写一些关于阿兰·图灵的东西”,清楚表明您希望文本专注于他的科学工作、个人生活、历史角色或其他方面可能会更有帮助。更多的,您还可以指定文本采取像专业记者写作的语调,或者更像是您向朋友写的随笔。

当然,如果你想象一下让一位新毕业的大学生为你完成这个任务,你甚至可以提前指定他们应该阅读哪些文本片段来写关于 Alan Turing的文本,那么这能够帮助这位新毕业的大学生更好地成功完成这项任务。下一章你会看到如何让提示清晰明确,创建提示的一个重要原则,你还会从提示的第二个原则中学到给LLM时间去思考。

2 编写 Prompt 的原则

本章的主要内容为编写 Prompt 的原则,在本章中,我们将给出两个编写 Prompt 的原则与一些相关的策略,你将练习基于这两个原则来编写有效的 Prompt,从而便捷而有效地使用 LLM。

环境配置

本教程使用 OpenAI 所开放的 ChatGPT API,因此你需要首先拥有一个 ChatGPT 的 API_KEY(也可以直接访问官方网址在线测试),然后需要安装 openai 的第三方库

首先需要安装所需第三方库:

openai:

pip install openai

dotenv:

pip install -U python-dotenv
# 将自己的 API-KEY 导入系统环境变量
export OPENAI_API_KEY='sk-**'

在 Python 脚本里,可以这样加载:

import openai
import os
from dotenv import load_dotenv, find_dotenv
# 导入第三方库

_ = load_dotenv(find_dotenv())
# 读取系统中的环境变量
openai.api_key = os.getenv('OPENAI_API_KEY')

# or 设置 API_KEY
openai.organization = 'org-**'
openai.api_key = "sk-**"

我们将在后续课程中深入探究 OpenAI 提供的 ChatCompletion API 的使用方法,在此处,我们先将它封装成一个函数,你无需知道其内部机理,仅需知道调用该函数输入 Prompt 其将会给出对应的 Completion 即可。

# 一个封装 OpenAI 接口的函数,参数为 Prompt,返回对应结果
def get_completion(prompt, model="gpt-3.5-turbo"):
 '''
 prompt: 对应的提示
 model: 调用的模型,默认为 gpt-3.5-turbo(ChatGPT),有内测资格的用户可以选择 gpt-4
 '''
 messages = [{"role": "user", "content": prompt}]
 response = openai.ChatCompletion.create(
 model=model,
 messages=messages,
 temperature=0, # 模型输出的温度系数,控制输出的随机程度
 )
 # 调用 OpenAI 的 ChatCompletion 接口
 return response.choices[0].message["content"]

原则一: 编写清晰、具体的指令

你应该通过提供尽可能清晰和具体的指令来表达您希望模型执行的操作。这将引导模型给出正确的输出,并减少你得到无关或不正确响应的可能。编写清晰的指令不意味着简短的指令,因为在许多情况下,更长的提示实际上更清晰且提供了更多上下文,这实际上可能导致更详细更相关的输出。

策略一:使用分隔符清晰地表示输入的不同部分,分隔符可以是:```""<>\<tag><\tag>

你可以使用任何明显的标点符号将特定的文本部分与提示的其余部分分开。这可以是任何可以使模型明确知道这是一个单独部分的标记。使用分隔符是一种可以避免提示注入的有用技术。提示注入是指如果用户将某些输入添加到提示中,则可能会向模型提供与您想要执行的操作相冲突的指令,从而使其遵循冲突的指令而不是执行您想要的操作。即,输入里面可能包含其他指令,会覆盖掉你的指令。对此,使用分隔符是一个不错的策略。

以下是一个例子,我们给出一段话并要求 GPT 进行总结,在该示例中我们使用 ``` 来作为分隔符。

英语例子
# 中文版见下一个 cell
text = f"""
You should express what you want a model to do by \
providing instructions that are as clear and \
specific as you can possibly make them. \
This will guide the model towards the desired output, \
and reduce the chances of receiving irrelevant \
or incorrect responses. Don't confuse writing a \
clear prompt with writing a short prompt. \
In many cases, longer prompts provide more clarity \
and context for the model, which can lead to \
more detailed and relevant outputs.
"""
prompt = f"""
Summarize the text delimited by triple backticks \
into a single sentence.
```{text}```
"""
response = get_completion(prompt)
print(response)
Clear and specific instructions should be provided to guide a model towards the desired output, and longer prompts can provide more clarity and context for the model, leading to more detailed and relevant outputs.
中文例子
text = f"""
你应该提供尽可能清晰、具体的指示,以表达你希望模型执行的任务。\
这将引导模型朝向所需的输出,并降低收到无关或不正确响应的可能性。\
不要将写清晰的提示与写简短的提示混淆。\
在许多情况下,更长的提示可以为模型提供更多的清晰度和上下文信息,从而导致更详细和相关的输出。
"""
# 需要总结的文本内容
prompt = f"""
把用三个反引号括起来的文本总结成一句话。
```{text}```
"""
# 指令内容,使用 ``` 来分隔指令和待总结的内容
response = get_completion(prompt)
print(response)
提供清晰具体的指示,避免无关或不正确响应,不要混淆写清晰和写简短,更长的提示可以提供更多清晰度和上下文信息,导致更详细和相关的输出。

策略二:要求一个结构化的输出,可以是 JsonHTML 等格式

第二个策略是要求生成一个结构化的输出,这可以使模型的输出更容易被我们解析,例如,你可以在 Python 中将其读入字典或列表中。

在以下示例中,我们要求 GPT 生成三本书的标题、作者和类别,并要求 GPT 以 Json 的格式返回给我们,为便于解析,我们指定了 Json 的键。

英语例子
prompt = f"""
Generate a list of three made-up book titles along \
with their authors and genres.
Provide them in JSON format with the following keys:
book_id, title, author, genre.
"""
response = get_completion(prompt)
print(response)
[
 {
 "book_id": 1,
 "title": "The Lost City of Zorath",
 "author": "Aria Blackwood",
 "genre": "Fantasy"
 },
 {
 "book_id": 2,
 "title": "The Last Survivors",
 "author": "Ethan Stone",
 "genre": "Science Fiction"
 },
 {
 "book_id": 3,
 "title": "The Secret Life of Bees",
 "author": "Lila Rose",
 "genre": "Romance"
 }
]
中文例子
prompt = f"""
请生成包括书名、作者和类别的三本虚构书籍清单,\
并以 JSON 格式提供,其中包含以下键:book_id、title、author、genre。
"""
response = get_completion(prompt)
print(response)
{
 "books": [
 {
 "book_id": 1,
 "title": "The Shadow of the Wind",
 "author": "Carlos Ruiz Zafón",
 "genre": "Mystery"
 },
 {
 "book_id": 2,
 "title": "The Name of the Wind",
 "author": "Patrick Rothfuss",
 "genre": "Fantasy"
 },
 {
 "book_id": 3,
 "title": "The Hitchhiker's Guide to the Galaxy",
 "author": "Douglas Adams",
 "genre": "Science Fiction"
 }
 ]
}

策略三:要求模型检查是否满足条件

如果任务做出的假设不一定满足,我们可以告诉模型先检查这些假设,如果不满足,指示并停止执行。你还可以考虑潜在的边缘情况以及模型应该如何处理它们,以避免意外的错误或结果。

在如下示例中,我们将分别给模型两段文本,分别是制作茶的步骤以及一段没有明确步骤的文本。我们将要求模型判断其是否包含一系列指令,如果包含则按照给定格式重新编写指令,不包含则回答未提供步骤。

英语例子1
text_1 = f"""
Making a cup of tea is easy! First, you need to get some \
water boiling. While that's happening, \
grab a cup and put a tea bag in it. Once the water is \
hot enough, just pour it over the tea bag. \
Let it sit for a bit so the tea can steep. After a \
few minutes, take out the tea bag. If you \
like, you can add some sugar or milk to taste. \
And that's it! You've got yourself a delicious \
cup of tea to enjoy.
"""
prompt = f"""
You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions, \
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
Step N - …

If the text does not contain a sequence of instructions, \
then simply write \"No steps provided.\"

\"\"\"{text_1}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 1:")
print(response)
Completion for Text 1:
Step 1 - Get some water boiling.
Step 2 - Grab a cup and put a tea bag in it.
Step 3 - Once the water is hot enough, pour it over the tea bag.
Step 4 - Let it sit for a bit so the tea can steep.
Step 5 - After a few minutes, take out the tea bag.
Step 6 - Add some sugar or milk to taste.
Step 7 - Enjoy your delicious cup of tea!
英语例子2
text_2 = f"""
The sun is shining brightly today, and the birds are \
singing. It's a beautiful day to go for a \
walk in the park. The flowers are blooming, and the \
trees are swaying gently in the breeze. People \
are out and about, enjoying the lovely weather. \
Some are having picnics, while others are playing \
games or simply relaxing on the grass. It's a \
perfect day to spend time outdoors and appreciate the \
beauty of nature.
"""
prompt = f"""You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions, \
re-write those instructions in the following format:
Step 1 - ...
Step 2 - …
Step N - …

If the text does not contain a sequence of instructions, \
then simply write \"No steps provided.\"

\"\"\"{text_2}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 2:")
print(response)
Completion for Text 2:
No steps provided.
中文例子1
# 有步骤的文本
text_1 = f"""
泡一杯茶很容易。首先,需要把水烧开。\
在等待期间,拿一个杯子并把茶包放进去。\
一旦水足够热,就把它倒在茶包上。\
等待一会儿,让茶叶浸泡。几分钟后,取出茶包。\
如果你愿意,可以加一些糖或牛奶调味。\
就这样,你可以享受一杯美味的茶了。
"""
prompt = f"""
您将获得由三个引号括起来的文本。\
如果它包含一系列的指令,则需要按照以下格式重新编写这些指令:

第一步 - ...
第二步 - …
第N步 - …

如果文本中不包含一系列的指令,则直接写“未提供步骤”。"
\"\"\"{text_1}\"\"\"
"""
response = get_completion(prompt)
print("Text 1 的总结:")
print(response)
Text 1 的总结:
第一步 - 把水烧开。
第二步 - 拿一个杯子并把茶包放进去。
第三步 - 把烧开的水倒在茶包上。
第四步 - 等待几分钟,让茶叶浸泡。
第五步 - 取出茶包。
第六步 - 如果你愿意,可以加一些糖或牛奶调味。
第七步 - 就这样,你可以享受一杯美味的茶了。
中文例子2
# 无步骤的文本
text_2 = f"""
今天阳光明媚,鸟儿在歌唱。\
这是一个去公园散步的美好日子。\
鲜花盛开,树枝在微风中轻轻摇曳。\
人们外出享受着这美好的天气,有些人在野餐,有些人在玩游戏或者在草地上放松。\
这是一个完美的日子,可以在户外度过并欣赏大自然的美景。
"""
prompt = f"""
您将获得由三个引号括起来的文本。\
如果它包含一系列的指令,则需要按照以下格式重新编写这些指令:

第一步 - ...
第二步 - …
第N步 - …

如果文本中不包含一系列的指令,则直接写“未提供步骤”。"
\"\"\"{text_2}\"\"\"
"""
response = get_completion(prompt)
print("Text 2 的总结:")
print(response)
Text 2 的总结:
未提供步骤。

策略四:提供少量示例

即在要求模型执行实际任务之前,提供给它少量成功执行任务的示例。

例如,在以下的示例中,我们告诉模型其任务是以一致的风格回答问题,并先给它一个孩子和一个祖父之间的对话的例子。孩子说,“教我耐心”,祖父用这些隐喻回答。因此,由于我们已经告诉模型要以一致的语气回答,现在我们说“教我韧性”,由于模型已经有了这个少样本示例,它将以类似的语气回答下一个任务。

英语例子
prompt = f"""
Your task is to answer in a consistent style.

<child>: Teach me about patience.

<grandparent>: The river that carves the deepest \
valley flows from a modest spring; the \
grandest symphony originates from a single note; \
the most intricate tapestry begins with a solitary thread.

<child>: Teach me about resilience.
"""
response = get_completion(prompt)
print(response)
<grandparent>: Resilience is like a tree that bends with the wind but never breaks. It is the ability to bounce back from adversity and keep moving forward, even when things get tough. Just like a tree that grows stronger with each storm it weathers, resilience is a quality that can be developed and strengthened over time.
中文例子
prompt = f"""
你的任务是以一致的风格回答问题。

<孩子>: 教我耐心。

<祖父母>: 挖出最深峡谷的河流源于一处不起眼的泉眼;最宏伟的交响乐从单一的音符开始;最复杂的挂毯以一根孤独的线开始编织。

<孩子>: 教我韧性。
"""
response = get_completion(prompt)
print(response)
<祖父母>: 韧性就像是一棵树,它需要经历风吹雨打、寒冬酷暑,才能成长得更加坚强。在生活中,我们也需要经历各种挫折和困难,才能锻炼出韧性。记住,不要轻易放弃,坚持下去,你会发现自己变得更加坚强。

原则二: 给模型时间去思考

如果模型匆忙地得出了错误的结论,您应该尝试重新构思查询,请求模型在提供最终答案之前进行一系列相关的推理。换句话说,如果您给模型一个在短时间或用少量文字无法完成的任务,它可能会猜测错误。这种情况对人来说也是一样的。如果您让某人在没有时间计算出答案的情况下完成复杂的数学问题,他们也可能会犯错误。因此,在这些情况下,您可以指示模型花更多时间思考问题,这意味着它在任务上花费了更多的计算资源。

策略一:指定完成任务所需的步骤

接下来我们将通过给定一个复杂任务,给出完成该任务的一系列步骤,来展示这一策略的效果

首先我们描述了杰克和吉尔的故事,并给出一个指令。该指令是执行以下操作。首先,用一句话概括三个反引号限定的文本。第二,将摘要翻译成法语。第三,在法语摘要中列出每个名称。第四,输出包含以下键的 JSON 对象:法语摘要和名称数。然后我们要用换行符分隔答案。

英语例子1
text = f"""
In a charming village, siblings Jack and Jill set out on \
a quest to fetch water from a hilltop \
well. As they climbed, singing joyfully, misfortune \
struck—Jack tripped on a stone and tumbled \
down the hill, with Jill following suit. \
Though slightly battered, the pair returned home to \
comforting embraces. Despite the mishap, \
their adventurous spirits remained undimmed, and they \
continued exploring with delight.
"""
# example 1
prompt_1 = f"""
Perform the following actions:
1 - Summarize the following text delimited by triple \
backticks with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the following \
keys: french_summary, num_names.

Separate your answers with line breaks.

Text:
```{text}```
"""
response = get_completion(prompt_1)
print("Completion for prompt 1:")
print(response)
Completion for prompt 1:
Two siblings, Jack and Jill, go on a quest to fetch water from a well on a hilltop, but misfortune strikes and they both tumble down the hill, returning home slightly battered but with their adventurous spirits undimmed.

Deux frères et sœurs, Jack et Jill, partent en quête d'eau d'un puits sur une colline, mais un malheur frappe et ils tombent tous les deux de la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts.
Noms: Jack, Jill.

{
 "french_summary": "Deux frères et sœurs, Jack et Jill, partent en quête d'eau d'un puits sur une colline, mais un malheur frappe et ils tombent tous les deux de la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts.",
 "num_names": 2
}
中文例子1
text = f"""
在一个迷人的村庄里,兄妹杰克和吉尔出发去一个山顶井里打水。\
他们一边唱着欢乐的歌,一边往上爬,\
然而不幸降临——杰克绊了一块石头,从山上滚了下来,吉尔紧随其后。\
虽然略有些摔伤,但他们还是回到了温馨的家中。\
尽管出了这样的意外,他们的冒险精神依然没有减弱,继续充满愉悦地探索。
"""
# example 1
prompt_1 = f"""
执行以下操作:
1-用一句话概括下面用三个反引号括起来的文本。
2-将摘要翻译成法语。
3-在法语摘要中列出每个人名。
4-输出一个 JSON 对象,其中包含以下键:French_summary,num_names。

请用换行符分隔您的答案。

Text:
```{text}```
"""
response = get_completion(prompt_1)
print("prompt 1:")
print(response)
prompt 1:
1-兄妹在山顶井里打水时发生意外,但仍然保持冒险精神。
2-Dans un charmant village, les frère et sœur Jack et Jill partent chercher de l'eau dans un puits au sommet de la montagne. Malheureusement, Jack trébuche sur une pierre et tombe de la montagne, suivi de près par Jill. Bien qu'ils soient légèrement blessés, ils retournent chez eux chaleureusement. Malgré cet accident, leur esprit d'aventure ne diminue pas et ils continuent à explorer joyeusement.
3-Jack, Jill
4-{
 "French_summary": "Dans un charmant village, les frère et sœur Jack et Jill partent chercher de l'eau dans un puits au sommet de la montagne. Malheureusement, Jack trébuche sur une pierre et tombe de la montagne, suivi de près par Jill. Bien qu'ils soient légèrement blessés, ils retournent chez eux chaleureusement. Malgré cet accident, leur esprit d'aventure ne diminue pas et ils continuent à explorer joyeusement.",
 "num_names": 2
}
英语例子2
prompt_2 = f"""
Your task is to perform the following actions:
1 - Summarize the following text delimited by <> with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the
following keys: french_summary, num_names.

Use the following format:
Text: <text to summarize>
Summary: <summary>
Translation: <summary translation>
Names: <list of names in French summary>
Output JSON: <json with summary and num_names>

Text: <{text}>
"""
response = get_completion(prompt_2)
print("\nCompletion for prompt 2:")
print(response)
Completion for prompt 2:
Summary: 兄妹杰克和吉尔在山顶井里打水时发生意外,但他们仍然保持冒险精神继续探索。
Translation: Jack and Jill, deux frères et sœurs, ont eu un accident en allant chercher de l'eau dans un puits de montagne, mais ils ont continué à explorer avec un esprit d'aventure.
Names: Jack, Jill
Output JSON: {"french_summary": "Jack and Jill, deux frères et sœurs, ont eu un accident en allant chercher de l'eau dans un puits de montagne, mais ils ont continué à explorer avec un esprit d'aventure.", "num_names": 2}
中文例子2
prompt_2 = f"""
1-用一句话概括下面用<>括起来的文本。
2-将摘要翻译成英语。
3-在英语摘要中列出每个名称。
4-输出一个 JSON 对象,其中包含以下键:English_summary,num_names。

请使用以下格式:
文本:<要总结的文本>
摘要:<摘要>
翻译:<摘要的翻译>
名称:<英语摘要中的名称列表>
输出 JSON:<带有 English_summary 和 num_names 的 JSON>

Text: <{text}>
"""
response = get_completion(prompt_2)
print("\nprompt 2:")
print(response)
prompt 2:
摘要:兄妹杰克和吉尔在迷人的村庄里冒险,不幸摔伤后回到家中,但仍然充满冒险精神。
翻译:In a charming village, siblings Jack and Jill set out to fetch water from a mountaintop well. While climbing and singing, Jack trips on a stone and tumbles down the mountain, with Jill following closely behind. Despite some bruises, they make it back home safely. Their adventurous spirit remains undiminished as they continue to explore with joy.
名称:Jack,Jill
输出 JSON:{"English_summary": "In a charming village, siblings Jack and Jill set out to fetch water from a mountaintop well. While climbing and singing, Jack trips on a stone and tumbles down the mountain, with Jill following closely behind. Despite some bruises, they make it back home safely. Their adventurous spirit remains undiminished as they continue to explore with joy.", "num_names": 2}

策略二:指导模型在下结论之前找出一个自己的解法

有时候,在明确指导模型在做决策之前要思考解决方案时,我们会得到更好的结果。

接下来我们会给出一个问题和一个学生的解答,要求模型判断解答是否正确

英语例子1
prompt = f"""
Determine if the student's solution is correct or not.

Question:
I'm building a solar power installation and I need \
 help working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations
as a function of the number of square feet.

Student's Solution:
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
学生的解决方案:
设x为发电站的大小,单位为平方英尺。
费用:
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
"""
response = get_completion(prompt)
print(response)
The student's solution is correct.
中文例子1
prompt = f"""
判断学生的解决方案是否正确。

问题:
我正在建造一个太阳能发电站,需要帮助计算财务。

 土地费用为 100美元/平方英尺
 我可以以 250美元/平方英尺的价格购买太阳能电池板
 我已经谈判好了维护合同,每年需要支付固定的10万美元,并额外支付每平方英尺10美元
 作为平方英尺数的函数,首年运营的总费用是多少。

学生的解决方案:
设x为发电站的大小,单位为平方英尺。
费用:

 土地费用:100x
 太阳能电池板费用:250x
 维护费用:100,000美元+100x
 总费用:100x+250x+100,000美元+100x=450x+100,000美元
"""
response = get_completion(prompt)
print(response)
学生的解决方案是正确的。

但是注意,学生的解决方案实际上是错误的。

我们可以通过指导模型先自行找出一个解法来解决这个问题。

在接下来这个 Prompt 中,我们要求模型先自行解决这个问题,再根据自己的解法与学生的解法进行对比,从而判断学生的解法是否正确。同时,我们给定了输出的格式要求。通过明确步骤,让模型有更多时间思考,有时可以获得更准确的结果。在这个例子中,学生的答案是错误的,但如果我们没有先让模型自己计算,那么可能会被误导以为学生是正确的。

英语例子2
prompt = f"""
Your task is to determine if the student's solution \
is correct or not.
To solve the problem do the following:
- First, work out your own solution to the problem.
- Then compare your solution to the student's solution \
and evaluate if the student's solution is correct or not.
Don't decide if the student's solution is correct until
you have done the problem yourself.

Use the following format:
Question:
```
question here
```
Student's solution:
```
student's solution here
```
Actual solution:
```
steps to work out the solution and your solution here
```
Is the student's solution the same as actual solution \
just calculated:
```
yes or no
```
Student grade:
```
correct or incorrect
```

Question:
```
I'm building a solar power installation and I need help \
working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations \
as a function of the number of square feet.
```
Student's solution:
```
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
```
Actual solution:
"""
response = get_completion(prompt)
print(response)
Let x be the size of the installation in square feet.

Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 10x

Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000

Is the student's solution the same as actual solution just calculated:
No

Student grade:
Incorrect
中文例子2
prompt = f"""
请判断学生的解决方案是否正确,请通过如下步骤解决这个问题:

步骤:

 首先,自己解决问题。
 然后将你的解决方案与学生的解决方案进行比较,并评估学生的解决方案是否正确。在自己完成问题之前,请勿决定学生的解决方案是否正确。

使用以下格式:

 问题:问题文本
 学生的解决方案:学生的解决方案文本
 实际解决方案和步骤:实际解决方案和步骤文本
 学生的解决方案和实际解决方案是否相同:是或否
 学生的成绩:正确或不正确

问题:

 我正在建造一个太阳能发电站,需要帮助计算财务。
 - 土地费用为每平方英尺100美元
 - 我可以以每平方英尺250美元的价格购买太阳能电池板
 - 我已经谈判好了维护合同,每年需要支付固定的10万美元,并额外支付每平方英尺10美元
 作为平方英尺数的函数,首年运营的总费用是多少。

学生的解决方案:

 设x为发电站的大小,单位为平方英尺。
 费用:
 1. 土地费用:100x
 2. 太阳能电池板费用:250x
 3. 维护费用:100,000+100x
 总费用:100x+250x+100,000+100x=450x+100,000

实际解决方案和步骤:
"""
response = get_completion(prompt)
print(response)
正确的解决方案和步骤:
 1. 计算土地费用:100美元/平方英尺 * x平方英尺 = 100x美元
 2. 计算太阳能电池板费用:250美元/平方英尺 * x平方英尺 = 250x美元
 3. 计算维护费用:10万美元 + 10美元/平方英尺 * x平方英尺 = 10万美元 + 10x美元
 4. 计算总费用:100x美元 + 250x美元 + 10万美元 + 10x美元 = 360x + 10万美元

学生的解决方案和实际解决方案是否相同:否

学生的成绩:不正确

局限性

虚假知识:模型偶尔会生成一些看似真实实则编造的知识

如果模型在训练过程中接触了大量的知识,它并没有完全记住所见的信息,因此它并不很清楚自己知识的边界。这意味着它可能会尝试回答有关晦涩主题的问题,并编造听起来合理但实际上并不正确的答案。我们称这些编造的想法为幻觉。

例如在如下示例中,我们要求告诉我们 Boie 公司生产的 AeroGlide UltraSlim Smart Toothbrush 产品的信息,事实上,这个公司是真实存在的,但产品是编造的,模型则会一本正经地告诉我们编造的知识。

英语例子
prompt = f"""
Tell me about AeroGlide UltraSlim Smart Toothbrush by Boie
"""
response = get_completion(prompt)
print(response)
The AeroGlide UltraSlim Smart Toothbrush by Boie is a high-tech toothbrush that uses advanced sonic technology to provide a deep and thorough clean. It features a slim and sleek design that makes it easy to hold and maneuver, and it comes with a range of smart features that help you optimize your brushing routine.

One of the key features of the AeroGlide UltraSlim Smart Toothbrush is its advanced sonic technology, which uses high-frequency vibrations to break up plaque and bacteria on your teeth and gums. This technology is highly effective at removing even the toughest stains and buildup, leaving your teeth feeling clean and fresh.

In addition to its sonic technology, the AeroGlide UltraSlim Smart Toothbrush also comes with a range of smart features that help you optimize your brushing routine. These include a built-in timer that ensures you brush for the recommended two minutes, as well as a pressure sensor that alerts you if you're brushing too hard.

Overall, the AeroGlide UltraSlim Smart Toothbrush by Boie is a highly advanced and effective toothbrush that is perfect for anyone looking to take their oral hygiene to the next level. With its advanced sonic technology and smart features, it provides a deep and thorough clean that leaves your teeth feeling fresh and healthy.
中文例子
prompt = f"""
告诉我 Boie 公司生产的 AeroGlide UltraSlim Smart Toothbrush 的相关信息
"""
response = get_completion(prompt)
print(response)
Boie公司生产的AeroGlide UltraSlim Smart Toothbrush是一款智能牙刷,具有以下特点:

1. 超薄设计:刷头仅有0.8毫米的厚度,可以更容易地进入口腔深处,清洁更彻底。

2. 智能感应:牙刷配备了智能感应技术,可以自动识别刷头的位置和方向,确保每个部位都得到充分的清洁。

3. 高效清洁:牙刷采用了高速振动技术,每分钟可达到40000次,可以有效去除牙菌斑和污渍。

4. 轻松携带:牙刷采用了便携式设计,可以轻松放入口袋或旅行包中,随时随地进行口腔清洁。

5. 环保材料:牙刷采用了环保材料制造,不含有害物质,对环境友好。

总之,Boie公司生产的AeroGlide UltraSlim Smart Toothbrush是一款高效、智能、环保的牙刷,可以帮助用户轻松保持口腔健康。

模型会输出看上去非常真实的编造知识,这有时会很危险。因此,请确保使用我们在本节中介绍的一些技巧,以尝试在构建自己的应用程序时避免这种情况。这是模型已知的一个弱点,也是我们正在积极努力解决的问题。在你希望模型根据文本生成答案的情况下,另一种减少幻觉的策略是先要求模型找到文本中的任何相关引用,然后要求它使用这些引用来回答问题,这种追溯源文档的方法通常对减少幻觉非常有帮助。

说明:在本教程中,我们使用 \ 来使文本适应屏幕大小以提高阅读体验,GPT 并不受 \ 的影响,但在你调用其他大模型时,需额外考虑 \ 是否会影响模型性能

3. 迭代式提示开发

当使用 LLM 构建应用程序时,我从来没有在第一次尝试中就成功使用最终应用程序中所需的 Prompt。但这并不重要,只要您有一个好的迭代过程来不断改进您的 Prompt,那么你就能够得到一个适合任务的 Prompt。我认为在提示方面,第一次成功的几率可能会高一些,但正如上所说,第一个提示是否有效并不重要。最重要的是为您的应用程序找到有效提示的过程。

因此,在本章中,我们将以从产品说明书中生成营销文案这一示例,展示一些框架,以提示你思考如何迭代地分析和完善你的 Prompt。

如果您之前与我一起上过机器学习课程,您可能见过我使用的一张图表,说明了机器学习开发的流程。通常是先有一个想法,然后再实现它:编写代码,获取数据,训练模型,这会给您一个实验结果。然后您可以查看输出结果,进行错误分析,找出它在哪里起作用或不起作用,甚至可以更改您想要解决的问题的确切思路或方法,然后更改实现并运行另一个实验等等,反复迭代,以获得有效的机器学习模型。在编写 Prompt 以使用 LLM 开发应用程序时,这个过程可能非常相似,您有一个关于要完成的任务的想法,可以尝试编写第一个 Prompt,满足上一章说过的两个原则:清晰明确,并且给系统足够的时间思考。然后您可以运行它并查看结果。如果第一次效果不好,那么迭代的过程就是找出为什么指令不够清晰或为什么没有给算法足够的时间思考,以便改进想法、改进提示等等,循环多次,直到找到适合您的应用程序的 Prompt。

任务:从产品说明书生成一份营销产品描述

这里有一个椅子的产品说明书,描述说它是一个中世纪灵感家族的一部分,讨论了构造、尺寸、椅子选项、材料等等,产地是意大利。假设您想要使用这份说明书帮助营销团队为在线零售网站撰写营销式描述

英语例子
# 示例:产品说明书
fact_sheet_chair = """
OVERVIEW
- Part of a beautiful family of mid-century inspired office furniture,
including filing cabinets, desks, bookcases, meeting tables, and more.
- Several options of shell color and base finishes.
- Available with plastic back and front upholstery (SWC-100)
or full upholstery (SWC-110) in 10 fabric and 6 leather options.
- Base finish options are: stainless steel, matte black,
gloss white, or chrome.
- Chair is available with or without armrests.
- Suitable for home or business settings.
- Qualified for contract use.

CONSTRUCTION
- 5-wheel plastic coated aluminum base.
- Pneumatic chair adjust for easy raise/lower action.

DIMENSIONS
- WIDTH 53 CM | 20.87”
- DEPTH 51 CM | 20.08”
- HEIGHT 80 CM | 31.50”
- SEAT HEIGHT 44 CM | 17.32”
- SEAT DEPTH 41 CM | 16.14”

OPTIONS
- Soft or hard-floor caster options.
- Two choices of seat foam densities:
medium (1.8 lb/ft3) or high (2.8 lb/ft3)
- Armless or 8 position PU armrests

MATERIALS
SHELL BASE GLIDER
- Cast Aluminum with modified nylon PA6/PA66 coating.
- Shell thickness: 10 mm.
SEAT
- HD36 foam

COUNTRY OF ORIGIN
- Italy
"""
# 提示:基于说明书生成营销描述
prompt = f"""
Your task is to help a marketing team create a
description for a retail website of a product based
on a technical fact sheet.

Write a product description based on the information
provided in the technical specifications delimited by
triple backticks.

Technical specifications: ```{fact_sheet_chair}```
"""
response = get_completion(prompt)
print(response)
Introducing our stunning mid-century inspired office chair, the perfect addition to any home or business setting. Part of a beautiful family of office furniture, including filing cabinets, desks, bookcases, meeting tables, and more, this chair is available in several options of shell color and base finishes to suit your style. Choose from plastic back and front upholstery (SWC-100) or full upholstery (SWC-110) in 10 fabric and 6 leather options.

The chair is constructed with a 5-wheel plastic coated aluminum base and features a pneumatic chair adjust for easy raise/lower action. It is available with or without armrests and is qualified for contract use. The base finish options are stainless steel, matte black, gloss white, or chrome.

Measuring at a width of 53 cm, depth of 51 cm, and height of 80 cm, with a seat height of 44 cm and seat depth of 41 cm, this chair is designed for ultimate comfort. You can also choose between soft or hard-floor caster options and two choices of seat foam densities: medium (1.8 lb/ft3) or high (2.8 lb/ft3). The armrests are available in either an armless or 8 position PU option.

The materials used in the construction of this chair are of the highest quality. The shell base glider is made of cast aluminum with modified nylon PA6/PA66 coating and has a shell thickness of 10 mm. The seat is made of HD36 foam, ensuring maximum comfort and durability.

This chair is made in Italy and is the perfect combination of style and functionality. Upgrade your workspace with our mid-century inspired office chair today!
中文例子
# 示例:产品说明书
fact_sheet_chair = """
概述

 美丽的中世纪风格办公家具系列的一部分,包括文件柜、办公桌、书柜、会议桌等。
 多种外壳颜色和底座涂层可选。
 可选塑料前后靠背装饰(SWC-100)或10种面料和6种皮革的全面装饰(SWC-110)。
 底座涂层选项为:不锈钢、哑光黑色、光泽白色或铬。
 椅子可带或不带扶手。
 适用于家庭或商业场所。
 符合合同使用资格。

结构

 五个轮子的塑料涂层铝底座。
 气动椅子调节,方便升降。

尺寸

 宽度53厘米|20.87英寸
 深度51厘米|20.08英寸
 高度80厘米|31.50英寸
 座椅高度44厘米|17.32英寸
 座椅深度41厘米|16.14英寸

选项

 软地板或硬地板滚轮选项。
 两种座椅泡沫密度可选:中等(1.8磅/立方英尺)或高(2.8磅/立方英尺)。
 无扶手或8个位置PU扶手。

材料
外壳底座滑动件

 改性尼龙PA6/PA66涂层的铸铝。
 外壳厚度:10毫米。
 座椅
 HD36泡沫

原产国

 意大利
"""
# 示例:产品说明书
fact_sheet_chair = """
概述

 美丽的中世纪风格办公家具系列的一部分,包括文件柜、办公桌、书柜、会议桌等。
 多种外壳颜色和底座涂层可选。
 可选塑料前后靠背装饰(SWC-100)或10种面料和6种皮革的全面装饰(SWC-110)。
 底座涂层选项为:不锈钢、哑光黑色、光泽白色或铬。
 椅子可带或不带扶手。
 适用于家庭或商业场所。
 符合合同使用资格。

结构

 五个轮子的塑料涂层铝底座。
 气动椅子调节,方便升降。

尺寸

 宽度53厘米|20.87英寸
 深度51厘米|20.08英寸
 高度80厘米|31.50英寸
 座椅高度44厘米|17.32英寸
 座椅深度41厘米|16.14英寸

选项

 软地板或硬地板滚轮选项。
 两种座椅泡沫密度可选:中等(1.8磅/立方英尺)或高(2.8磅/立方英尺)。
 无扶手或8个位置PU扶手。

材料
外壳底座滑动件

 改性尼龙PA6/PA66涂层的铸铝。
 外壳厚度:10毫米。
 座椅
 HD36泡沫

原产国

 意大利
"""
# 提示:基于说明书创建营销描述
prompt = f"""
你的任务是帮助营销团队基于技术说明书创建一个产品的营销描述。

根据```标记的技术说明书中提供的信息,编写一个产品描述。

技术说明: ```{fact_sheet_chair}```
"""
response = get_completion(prompt)
print(response)
产品描述:

我们自豪地推出美丽的中世纪风格办公家具系列,其中包括文件柜、办公桌、书柜、会议桌等。我们的产品采用多种外壳颜色和底座涂层,以满足您的个性化需求。您可以选择塑料前后靠背装饰(SWC-100)或10种面料和6种皮革的全面装饰(SWC-110),以使您的办公室更加舒适和时尚。

我们的底座涂层选项包括不锈钢、哑光黑色、光泽白色或铬,以满足您的不同需求。椅子可带或不带扶手,适用于家庭或商业场所。我们的产品符合合同使用资格,为您提供更加可靠的保障。

我们的产品采用五个轮子的塑料涂层铝底座,气动椅子调节,方便升降。尺寸为宽度53厘米|20.87英寸,深度51厘米|20.08英寸,高度80厘米|31.50英寸,座椅高度44厘米|17.32英寸,座椅深度41厘米|16.14英寸,为您提供舒适的使用体验。

我们的产品还提供软地板或硬地板滚轮选项,两种座椅泡沫密度可选:中等(1.8磅/立方英尺)或高(2.8磅/立方英尺),以及无扶手或8个位置PU扶手,以满足您的不同需求。

我们的产品采用改性尼龙PA6/PA66涂层的铸铝外壳底座滑动件,外壳厚度为10毫米,座椅采用HD36泡沫,为您提供更加舒适的使用体验。我们的产品原产国为意大利,为您提供更加优质的品质保证。

问题一: 生成文本太长

它似乎很好地写了一个描述,介绍了一个惊人的中世纪灵感办公椅,很好地完成了要求,即从技术说明书开始编写产品描述。但是当我看到这个时,我会觉得这个太长了。

所以我有了一个想法。我写了一个提示,得到了结果。但是我对它不是很满意,因为它太长了,所以我会澄清我的提示,并说最多使用50个字。

因此,我通过要求它限制生成文本长度来解决这一问题

英语例子
# 优化后的 Prompt,要求生成描述不多于 50 词
prompt = f"""
Your task is to help a marketing team create a
description for a retail website of a product based
on a technical fact sheet.

Write a product description based on the information
provided in the technical specifications delimited by
triple backticks.

Use at most 50 words.

Technical specifications: ```{fact_sheet_chair}```
"""
response = get_completion(prompt)
print(response)
Introducing our beautiful medieval-style office furniture collection, including filing cabinets, desks, bookcases, and conference tables. Choose from a variety of shell colors and base coatings, with optional plastic or fabric/leather decoration. The chair features a plastic-coated aluminum base with five wheels and pneumatic height adjustment. Perfect for home or commercial use. Made in Italy.

取出回答并根据空格拆分,答案为54个字,较好地完成了我的要求

lst = response.split()
print(len(lst))
54
中文例子
# 优化后的 Prompt,要求生成描述不多于 50 词
prompt = f"""
您的任务是帮助营销团队基于技术说明书创建一个产品的零售网站描述。

根据```标记的技术说明书中提供的信息,编写一个产品描述。

使用最多50个词。

技术规格:```{fact_sheet_chair}```
"""
response = get_completion(prompt)
print(response)
中世纪风格办公家具系列,包括文件柜、办公桌、书柜、会议桌等。多种颜色和涂层可选,可带或不带扶手。底座涂层选项为不锈钢、哑光黑色、光泽白色或铬。适用于家庭或商业场所,符合合同使用资格。意大利制造。
# 由于中文需要分词,此处直接计算整体长度
len(response)
97

LLM在遵循非常精确的字数限制方面表现得还可以,但并不那么出色。有时它会输出60或65个单词的内容,但这还算是合理的。这原因是 LLM 解释文本使用一种叫做分词器的东西,但它们往往在计算字符方面表现一般般。有很多不同的方法来尝试控制你得到的输出的长度。

问题二: 文本关注在错误的细节上

我们会发现的第二个问题是,这个网站并不是直接向消费者销售,它实际上旨在向家具零售商销售家具,他们会更关心椅子的技术细节和材料。在这种情况下,你可以修改这个提示,让它更精确地描述椅子的技术细节。

解决方法:要求它专注于与目标受众相关的方面。

英语例子
# 优化后的 Prompt,说明面向对象,应具有什么性质且侧重于什么方面
prompt = f"""
Your task is to help a marketing team create a
description for a retail website of a product based
on a technical fact sheet.

Write a product description based on the information
provided in the technical specifications delimited by
triple backticks.

The description is intended for furniture retailers,
so should be technical in nature and focus on the
materials the product is constructed from.

Use at most 50 words.

Technical specifications: ```{fact_sheet_chair}```
"""
response = get_completion(prompt)
print(response)
Introducing our beautiful medieval-style office furniture collection, including file cabinets, desks, bookcases, and conference tables. Available in multiple shell colors and base coatings, with optional plastic or fabric/leather upholstery. Features a plastic-coated aluminum base with five wheels and pneumatic chair adjustment. Suitable for home or commercial use and made with high-quality materials, including cast aluminum with a modified nylon coating and HD36 foam. Made in Italy.
中文例子
# 优化后的 Prompt,说明面向对象,应具有什么性质且侧重于什么方面
prompt = f"""
您的任务是帮助营销团队基于技术说明书创建一个产品的零售网站描述。

根据```标记的技术说明书中提供的信息,编写一个产品描述。

该描述面向家具零售商,因此应具有技术性质,并侧重于产品的材料构造。

使用最多50个单词。

技术规格: ```{fact_sheet_chair}```
"""
response = get_completion(prompt)
print(response)
这款中世纪风格办公家具系列包括文件柜办公桌书柜和会议桌等适用于家庭或商业场所可选多种外壳颜色和底座涂层底座涂层选项为不锈钢哑光黑色光泽白色或铬椅子可带或不带扶手可选软地板或硬地板滚轮两种座椅泡沫密度可选外壳底座滑动件采用改性尼龙PA6/PA66涂层的铸铝座椅采用HD36泡沫原产国为意大利

我可能进一步想要在描述的结尾包括产品ID。因此,我可以进一步改进这个提示,要求在描述的结尾,包括在技术说明中的每个7个字符产品ID。

英语例子
# 更进一步,要求在描述末尾包含 7个字符的产品ID
prompt = f"""
Your task is to help a marketing team create a
description for a retail website of a product based
on a technical fact sheet.

Write a product description based on the information
provided in the technical specifications delimited by
triple backticks.

The description is intended for furniture retailers,
so should be technical in nature and focus on the
materials the product is constructed from.

At the end of the description, include every 7-character
Product ID in the technical specification.

Use at most 50 words.

Technical specifications: ```{fact_sheet_chair}```
"""
response = get_completion(prompt)
print(response)
Introducing our beautiful medieval-style office furniture collection, featuring file cabinets, desks, bookshelves, and conference tables. Available in multiple shell colors and base coatings, with optional plastic or fabric/leather decorations. The chair comes with or without armrests and has a plastic-coated aluminum base with five wheels and pneumatic height adjustment. Suitable for home or commercial use. Made in Italy.

Product IDs: SWC-100, SWC-110
中文例子
# 更进一步
prompt = f"""
您的任务是帮助营销团队基于技术说明书创建一个产品的零售网站描述。

根据```标记的技术说明书中提供的信息,编写一个产品描述。

该描述面向家具零售商,因此应具有技术性质,并侧重于产品的材料构造。

在描述末尾,包括技术规格中每个7个字符的产品ID。

使用最多50个单词。

技术规格: ```{fact_sheet_chair}```
"""
response = get_completion(prompt)
print(response)
这款中世纪风格的办公家具系列包括文件柜、办公桌、书柜和会议桌等,适用于家庭或商业场所。可选多种外壳颜色和底座涂层,底座涂层选项为不锈钢、哑光黑色、光泽白色或铬。椅子可带或不带扶手,可选塑料前后靠背装饰或10种面料和6种皮革的全面装饰。座椅采用HD36泡沫,可选中等或高密度,座椅高度44厘米,深度41厘米。外壳底座滑动件采用改性尼龙PA6/PA66涂层的铸铝,外壳厚度为10毫米。原产国为意大利。产品ID:SWC-100/SWC-110。

问题三: 需要一个表格形式的描述

以上是许多开发人员通常会经历的迭代提示开发的简短示例。我的建议是,像上一章中所演示的那样,Prompt 应该保持清晰和明确,并在必要时给模型一些思考时间。在这些要求的基础上,通常值得首先尝试编写 Prompt ,看看会发生什么,然后从那里开始迭代地完善 Prompt,以逐渐接近所需的结果。因此,许多成功的Prompt都是通过这种迭代过程得出的。我将向您展示一个更复杂的提示示例,可能会让您对ChatGPT的能力有更深入的了解。

这里我添加了一些额外的说明,要求它抽取信息并组织成表格,并指定表格的列、表名和格式,还要求它将所有内容格式化为可以在网页使用的 HTML。

英语例子
# 要求它抽取信息并组织成表格,并指定表格的列、表名和格式
prompt = f"""
Your task is to help a marketing team create a
description for a retail website of a product based
on a technical fact sheet.

Write a product description based on the information
provided in the technical specifications delimited by
triple backticks.

The description is intended for furniture retailers,
so should be technical in nature and focus on the
materials the product is constructed from.

At the end of the description, include every 7-character
Product ID in the technical specification.

After the description, include a table that gives the
product's dimensions. The table should have two columns.
In the first column include the name of the dimension.
In the second column include the measurements in inches only.

Give the table the title 'Product Dimensions'.

Format everything as HTML that can be used in a website.
Place the description in a <div> element.

Technical specifications: ```{fact_sheet_chair}```
"""

response = get_completion(prompt)
print(response)
<div>
 <p>Introducing our beautiful collection of medieval-style office furniture, including file cabinets, desks, bookcases, and conference tables. Choose from a variety of shell colors and base coatings. You can opt for plastic front and backrest decoration (SWC-100) or full decoration with 10 fabrics and 6 leathers (SWC-110). Base coating options include stainless steel, matte black, glossy white, or chrome. The chair is available with or without armrests and is suitable for both home and commercial settings. It is contract eligible.</p>
 <p>The structure features a plastic-coated aluminum base with five wheels. The chair is pneumatically adjustable for easy height adjustment.</p>
 <p>Product IDs: SWC-100, SWC-110</p>
 <table>
 <caption>Product Dimensions</caption>
 <tr>
 <td>Width</td>
 <td>20.87 inches</td>
 </tr>
 <tr>
 <td>Depth</td>
 <td>20.08 inches</td>
 </tr>
 <tr>
 <td>Height</td>
 <td>31.50 inches</td>
 </tr>
 <tr>
 <td>Seat Height</td>
 <td>17.32 inches</td>
 </tr>
 <tr>
 <td>Seat Depth</td>
 <td>16.14 inches</td>
 </tr>
 </table>
 <p>Options include soft or hard floor casters. You can choose from two seat foam densities: medium (1.8 pounds/cubic foot) or high (2.8 pounds/cubic foot). The chair is available with or without 8-position PU armrests.</p>
 <p>Materials:</p>
 <ul>
 <li>Shell, base, and sliding parts: cast aluminum coated with modified nylon PA6/PA66. Shell thickness: 10mm.</li>
 <li>Seat: HD36 foam</li>
 </ul>
 <p>Made in Italy.</p>
</div>
# 表格是以 HTML 格式呈现的,加载出来
from IPython.display import display, HTML

display(HTML(response))

中文例子
# 要求它抽取信息并组织成表格,并指定表格的列、表名和格式
prompt = f"""
您的任务是帮助营销团队基于技术说明书创建一个产品的零售网站描述。

根据```标记的技术说明书中提供的信息,编写一个产品描述。

该描述面向家具零售商,因此应具有技术性质,并侧重于产品的材料构造。

在描述末尾,包括技术规格中每个7个字符的产品ID。

在描述之后,包括一个表格,提供产品的尺寸。表格应该有两列。第一列包括尺寸的名称。第二列只包括英寸的测量值。

给表格命名为“产品尺寸”。

将所有内容格式化为可用于网站的HTML格式。将描述放在<div>元素中。

技术规格:```{fact_sheet_chair}```
"""

response = get_completion(prompt)
print(response)
<div>
<h2>中世纪风格办公家具系列椅子</h2>
<p>这款椅子是中世纪风格办公家具系列的一部分,适用于家庭或商业场所。它有多种外壳颜色和底座涂层可选,包括不锈钢、哑光黑色、光泽白色或铬。您可以选择带或不带扶手的椅子,以及软地板或硬地板滚轮选项。此外,您可以选择两种座椅泡沫密度:中等(1.8磅/立方英尺)或高(2.8磅/立方英尺)。</p>
<p>椅子的外壳底座滑动件是改性尼龙PA6/PA66涂层的铸铝,外壳厚度为10毫米。座椅采用HD36泡沫,底座是五个轮子的塑料涂层铝底座,可以进行气动椅子调节,方便升降。此外,椅子符合合同使用资格,是您理想的选择。</p>
<p>产品ID:SWC-100</p>
</div>

<table>
 <caption>产品尺寸</caption>
 <tr>
 <th>宽度</th>
 <td>20.87英寸</td>
 </tr>
 <tr>
 <th>深度</th>
 <td>20.08英寸</td>
 </tr>
 <tr>
 <th>高度</th>
 <td>31.50英寸</td>
 </tr>
 <tr>
 <th>座椅高度</th>
 <td>17.32英寸</td>
 </tr>
 <tr>
 <th>座椅深度</th>
 <td>16.14英寸</td>
 </tr>
</table>
# 表格是以 HTML 格式呈现的,加载出来
from IPython.display import display, HTML

display(HTML(response))

本章的主要内容是 LLM 在开发应用程序中的迭代式提示开发过程。开发者需要先尝试编写提示,然后通过迭代逐步完善它,直至得到需要的结果。关键在于拥有一种有效的开发Prompt的过程,而不是知道完美的Prompt。对于一些更复杂的应用程序,可以对多个样本进行迭代开发提示并进行评估。最后,可以在更成熟的应用程序中测试多个Prompt在多个样本上的平均或最差性能。在使用 Jupyter 代码笔记本示例时,请尝试不同的变化并查看结果。

4. 文本概括 Summarizing

当今世界上有太多的文本信息,几乎没有人能够拥有足够的时间去阅读所有我们想了解的东西。但令人感到欣喜的是,目前LLM在文本概括任务上展现了强大的水准,也已经有不少团队将这项功能插入了自己的软件应用中。

本章节将介绍如何使用编程的方式,调用API接口来实现“文本概括”功能。

单一文本概括Prompt实验

这里我们举了个商品评论的例子。对于电商平台来说,网站上往往存在着海量的商品评论,这些评论反映了所有客户的想法。如果我们拥有一个工具去概括这些海量、冗长的评论,便能够快速地浏览更多评论,洞悉客户的偏好,从而指导平台与商家提供更优质的服务。

输入文本

prod_review = """
Got this panda plush toy for my daughter's birthday, \
who loves it and takes it everywhere. It's soft and \
super cute, and its face has a friendly look. It's \
a bit small for what I paid though. I think there \
might be other options that are bigger for the \
same price. It arrived a day earlier than expected, \
so I got to play with it myself before I gave it \
to her.
"""

输入文本(中文翻译)

prod_review_zh = """
这个熊猫公仔是我给女儿的生日礼物,她很喜欢,去哪都带着。
公仔很软,超级可爱,面部表情也很和善。但是相比于价钱来说,
它有点小,我感觉在别的地方用同样的价钱能买到更大的。
快递比预期提前了一天到货,所以在送给女儿之前,我自己玩了会。
"""

限制输出文本长度

我们尝试限制文本长度为最多30词。

英语例子
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site.

Summarize the review below, delimited by triple
backticks, in at most 30 words.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)
Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early.
中文例子
prompt = f"""
你的任务是从电子商务网站上生成一个产品评论的简短摘要。

请对三个反引号之间的评论文本进行概括,最多30个词汇。

评论: ```{prod_review_zh}```
"""

response = get_completion(prompt)
print(response)
可爱软熊猫公仔,女儿喜欢,面部表情和善,但价钱有点小贵,快递提前一天到货。

关键角度侧重

有时,针对不同的业务,我们对文本的侧重会有所不同。例如对于商品评论文本,物流会更关心运输时效,商家更加关心价格与商品质量,平台更关心整体服务体验。

我们可以通过增加Prompt提示,来体现对于某个特定角度的侧重。

侧重于运输

英语例子
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
Shipping deparmtment.

Summarize the review below, delimited by triple
backticks, in at most 30 words, and focusing on any aspects \
that mention shipping and delivery of the product.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)
The panda plush toy arrived a day earlier than expected, but the customer felt it was a bit small for the price paid.
中文例子
prompt = f"""
你的任务是从电子商务网站上生成一个产品评论的简短摘要。

请对三个反引号之间的评论文本进行概括,最多30个词汇,并且聚焦在产品运输上。

评论: ```{prod_review_zh}```
"""

response = get_completion(prompt)
print(response)
快递提前到货,熊猫公仔软可爱,但有点小,价钱不太划算。

可以看到,输出结果以“快递提前一天到货”开头,体现了对于快递效率的侧重。

侧重于价格与质量

英语例子
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
pricing deparmtment, responsible for determining the \
price of the product.

Summarize the review below, delimited by triple
backticks, in at most 30 words, and focusing on any aspects \
that are relevant to the price and perceived value.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)
The panda plush toy is soft, cute, and loved by the recipient, but the price may be too high for its size compared to other options.
中文例子
prompt = f"""
你的任务是从电子商务网站上生成一个产品评论的简短摘要。

请对三个反引号之间的评论文本进行概括,最多30个词汇,并且聚焦在产品价格和质量上。

评论: ```{prod_review_zh}```
"""

response = get_completion(prompt)
print(response)
可爱软熊猫公仔,面部表情友好,但价钱有点高,尺寸较小。快递提前一天到货。

可以看到,输出结果以“质量好、价格小贵、尺寸小”开头,体现了对于产品价格与质量的侧重。

关键信息提取

在上一节中,虽然我们通过添加关键角度侧重的Prompt,使得文本摘要更侧重于某一特定方面,但是可以发现,结果中也会保留一些其他信息,如价格与质量角度的概括中仍保留了“快递提前到货”的信息。有时这些信息是有帮助的,但如果我们只想要提取某一角度的信息,并过滤掉其他所有信息,则可以要求LLM进行“文本提取(Extract)”而非“文本概括(Summarize)”。

英语例子
prompt = f"""
Your task is to extract relevant information from \
a product review from an ecommerce site to give \
feedback to the Shipping department.

From the review below, delimited by triple quotes \
extract the information relevant to shipping and \
delivery. Limit to 30 words.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)
"The product arrived a day earlier than expected."
中文例子
prompt = f"""
你的任务是从电子商务网站上的产品评论中提取相关信息。

请从以下三个反引号之间的评论文本中提取产品运输相关的信息,最多30个词汇。

评论: ```{prod_review_zh}```
"""

response = get_completion(prompt)
print(response)
快递比预期提前了一天到货。`

多条文本概括Prompt实验

在实际的工作流中,我们往往有许许多多的评论文本,以下展示了一个基于for循环调用“文本概括”工具并依次打印的示例。当然,在实际生产中,对于上百万甚至上千万的评论文本,使用for循环也是不现实的,可能需要考虑整合评论、分布式等方法提升运算效率。

英语例子
review_1 = prod_review

# review for a standing lamp
review_2 = """
Needed a nice lamp for my bedroom, and this one \
had additional storage and not too high of a price \
point. Got it fast - arrived in 2 days. The string \
to the lamp broke during the transit and the company \
happily sent over a new one. Came within a few days \
as well. It was easy to put together. Then I had a \
missing part, so I contacted their support and they \
very quickly got me the missing piece! Seems to me \
to be a great company that cares about their customers \
and products.
"""

# review for an electric toothbrush
review_3 = """
My dental hygienist recommended an electric toothbrush, \
which is why I got this. The battery life seems to be \
pretty impressive so far. After initial charging and \
leaving the charger plugged in for the first week to \
condition the battery, I've unplugged the charger and \
been using it for twice daily brushing for the last \
3 weeks all on the same charge. But the toothbrush head \
is too small. I’ve seen baby toothbrushes bigger than \
this one. I wish the head was bigger with different \
length bristles to get between teeth better because \
this one doesn’t. Overall if you can get this one \
around the $50 mark, it's a good deal. The manufactuer's \
replacements heads are pretty expensive, but you can \
get generic ones that're more reasonably priced. This \
toothbrush makes me feel like I've been to the dentist \
every day. My teeth feel sparkly clean!
"""

# review for a blender
review_4 = """
So, they still had the 17 piece system on seasonal \
sale for around $49 in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between $70-$89 for the same \
system. And the 11 piece system went up around $10 or \
so in price also from the earlier sale price of $29. \
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy). Special tip when making \
smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie. \
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days.
"""

reviews = [review_1, review_2, review_3, review_4]
for i in range(len(reviews)):
 prompt = f"""
 Your task is to generate a short summary of a product \
 review from an ecommerce site.

 Summarize the review below, delimited by triple \
 backticks in at most 20 words.

 Review: ```{reviews[i]}```
 """
 response = get_completion(prompt)
 print(i, response, "\n")
0 Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early.

1 Affordable lamp with storage, fast shipping, and excellent customer service. Easy to assemble and missing parts were quickly replaced.

2 Good battery life, small toothbrush head, but effective cleaning. Good deal if bought around $50.

3 The product was on sale for $49 in November, but the price increased to $70-$89 in December. The base doesn't look as good as previous editions, but the reviewer plans to be gentle with it. A special tip for making smoothies is to freeze the fruits and vegetables beforehand. The motor made a funny noise after a year, and the warranty had expired. Overall quality has decreased.

5. 推断 Inferring

在这节课中,你将从产品评论和新闻文章中推断情感和主题。

这些任务可以看作是模型接收文本作为输入并执行某种分析的过程。这可能涉及提取标签、提取实体、理解文本情感等等。如果你想要从一段文本中提取正面或负面情感,在传统的机器学习工作流程中,需要收集标签数据集、训练模型、确定如何在云端部署模型并进行推断。这样做可能效果还不错,但是这个过程需要很多工作。而且对于每个任务,如情感分析、提取实体等等,都需要训练和部署单独的模型。

大型语言模型的一个非常好的特点是,对于许多这样的任务,你只需要编写一个prompt即可开始产生结果,而不需要进行大量的工作。这极大地加快了应用程序开发的速度。你还可以只使用一个模型和一个 API 来执行许多不同的任务,而不需要弄清楚如何训练和部署许多不同的模型。

商品评论文本

这是一盏台灯的评论。

英语例子
lamp_review = """
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point. \
Got it fast. The string to our lamp broke during the \
transit and the company happily sent over a new one. \
Came within a few days as well. It was easy to put \
together. I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""
中文例子
# 中文
lamp_review_zh = """
我需要一盏漂亮的卧室灯,这款灯具有额外的储物功能,价格也不算太高。\
我很快就收到了它。在运输过程中,我们的灯绳断了,但是公司很乐意寄送了一个新的。\
几天后就收到了。这款灯很容易组装。我发现少了一个零件,于是联系了他们的客服,他们很快就给我寄来了缺失的零件!\
在我看来,Lumina 是一家非常关心顾客和产品的优秀公司!
"""

情感(正向/负向)

现在让我们来编写一个prompt来分类这个评论的情感。如果我想让系统告诉我这个评论的情感是什么,只需要编写 “以下产品评论的情感是什么” 这个prompt,加上通常的分隔符和评论文本等等。

然后让我们运行一下。结果显示这个产品评论的情感是积极的,这似乎是非常正确的。虽然这盏台灯不完美,但这个客户似乎非常满意。这似乎是一家关心客户和产品的伟大公司,可以认为积极的情感似乎是正确的答案。

英语例子
prompt = f"""
What is the sentiment of the following product review,
which is delimited with triple backticks?

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)
The sentiment of the product review is positive.
中文例子
# 中文
prompt = f"""
以下用三个反引号分隔的产品评论的情感是什么?

评论文本: ```{lamp_review_zh}```
"""
response = get_completion(prompt)
print(response)
情感是积极的/正面的。

如果你想要给出更简洁的答案,以便更容易进行后处理,可以使用上面的prompt并添加另一个指令,以一个单词 “正面” 或 “负面” 的形式给出答案。这样就只会打印出 “正面” 这个单词,这使得文本更容易接受这个输出并进行处理。

英语例子
prompt = f"""
What is the sentiment of the following product review,
which is delimited with triple backticks?

Give your answer as a single word, either "positive" \
or "negative".

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)
positive
中文例子
prompt = f"""
以下用三个反引号分隔的产品评论的情感是什么?

用一个单词回答:「正面」或「负面」。

评论文本: ```{lamp_review_zh}```
"""
response = get_completion(prompt)
print(response)
正面

识别情感类型

让我们看看另一个prompt,仍然使用台灯评论。这次我要让它识别出以下评论作者所表达的情感列表,不超过五个。

英语例子
prompt = f"""
Identify a list of emotions that the writer of the \
following review is expressing. Include no more than \
five items in the list. Format your answer as a list of \
lower-case words separated by commas.

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)
satisfied, grateful, impressed, content, pleased
中文例子
# 中文
prompt = f"""
识别以下评论的作者表达的情感。包含不超过五个项目。将答案格式化为以逗号分隔的单词列表。

评论文本: ```{lamp_review_zh}```
"""
response = get_completion(prompt)
print(response)
满意,感激,信任,赞扬,愉快

大型语言模型非常擅长从一段文本中提取特定的东西。在上面的例子中,评论正在表达情感,这可能有助于了解客户如何看待特定的产品。

识别愤怒

对于很多企业来说,了解某个顾客是否非常生气很重要。所以你可能有一个类似这样的分类问题:以下评论的作者是否表达了愤怒情绪?因为如果有人真的很生气,那么可能值得额外关注,让客户支持或客户成功团队联系客户以了解情况,并为客户解决问题。

英语例子
prompt = f"""
Is the writer of the following review expressing anger?\
The review is delimited with triple backticks. \
Give your answer as either yes or no.

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)
No
中文例子
# 中文
prompt = f"""
以下评论的作者是否表达了愤怒?评论用三个反引号分隔。给出是或否的答案。

评论文本: ```{lamp_review_zh}```
"""
response = get_completion(prompt)
print(response)

上面这个例子中,客户并没有生气。注意,如果使用常规的监督学习,如果想要建立所有这些分类器,不可能在几分钟内就做到这一点。我们鼓励大家尝试更改一些这样的prompt,也许询问客户是否表达了喜悦,或者询问是否有任何遗漏的部分,并看看是否可以让prompt对这个灯具评论做出不同的推论。

6. 文本转换 Transforming

LLM非常擅长将输入转换成不同的格式,例如多语种文本翻译、拼写及语法纠正、语气调整、格式转换等。

本章节将介绍如何使用编程的方式,调用API接口来实现“文本转换”功能。

文本翻译

中文转西班牙语

例子
prompt = f"""
将以下中文翻译成西班牙语: \
```您好,我想订购一个搅拌机。```
"""
response = get_completion(prompt)
print(response)
Hola, me gustaría ordenar una batidora.

识别语种

例子
prompt = f"""
请告诉我以下文本是什么语种:
```Combien coûte le lampadaire?```
"""
response = get_completion(prompt)
print(response)
这是法语。

多语种翻译

例子
prompt = f"""
请将以下文本分别翻译成中文、英文、法语和西班牙语:
```I want to order a basketball.```
"""
response = get_completion(prompt)
print(response)
中文:我想订购一个篮球。
英文:I want to order a basketball.
法语:Je veux commander un ballon de basket.
西班牙语:Quiero pedir una pelota de baloncesto.

翻译+正式语气

例子
prompt = f"""
请将以下文本翻译成中文,分别展示成正式与非正式两种语气:
```Would you like to order a pillow?```
"""
response = get_completion(prompt)
print(response)
正式语气:请问您需要订购枕头吗?
非正式语气:你要不要订一个枕头?

通用翻译器

随着全球化与跨境商务的发展,交流的用户可能来自各个不同的国家,使用不同的语言,因此我们需要一个通用翻译器,识别各个消息的语种,并翻译成目标用户的母语,从而实现更方便的跨国交流。

例子
user_messages = [
 "La performance du système est plus lente que d'habitude.", # System performance is slower than normal 
 "Mi monitor tiene píxeles que no se iluminan.", # My monitor has pixels that are not lighting
 "Il mio mouse non funziona", # My mouse is not working
 "Mój klawisz Ctrl jest zepsuty", # My keyboard has a broken control key
 "我的屏幕在闪烁" # My screen is flashing
]

for issue in user_messages:
 prompt = f"告诉我以下文本是什么语种,直接输出语种,如法语,无需输出标点符号: ```{issue}```"
 lang = get_completion(prompt)
 print(f"原始消息 ({lang}): {issue}\n")

 prompt = f"""
 将以下消息分别翻译成英文和中文,并写成
 中文翻译:xxx
 英文翻译:yyy
 的格式:
 ```{issue}```
 """
 response = get_completion(prompt)
 print(response, "\n=========================================")
原始消息 (法语): La performance du système est plus lente que d'habitude.

中文翻译:系统性能比平时慢。
英文翻译:The system performance is slower than usual.
=========================================
原始消息 (西班牙语): Mi monitor tiene píxeles que no se iluminan.

中文翻译:我的显示器有一些像素点不亮。
英文翻译:My monitor has pixels that don't light up.
=========================================
原始消息 (意大利语): Il mio mouse non funziona

中文翻译:我的鼠标不工作了。
英文翻译:My mouse is not working.
=========================================
原始消息 (波兰语): Mój klawisz Ctrl jest zepsuty

中文翻译:我的Ctrl键坏了
英文翻译:My Ctrl key is broken.
=========================================
原始消息 (中文): 我的屏幕在闪烁

中文翻译:我的屏幕在闪烁。
英文翻译:My screen is flickering.
=========================================

语气/风格调整

写作的语气往往会根据受众对象而有所调整。例如,对于工作邮件,我们常常需要使用正式语气与书面用词,而对同龄朋友的微信聊天,可能更多地会使用轻松、口语化的语气。

例子
prompt = f"""
将以下文本翻译成商务信函的格式:
```小老弟,我小羊,上回你说咱部门要采购的显示器是多少寸来着?```
"""
response = get_completion(prompt)
print(response)
尊敬的XXX(收件人姓名):

您好!我是XXX(发件人姓名),在此向您咨询一个问题。上次我们交流时,您提到我们部门需要采购显示器,但我忘记了您所需的尺寸是多少英寸。希望您能够回复我,以便我们能够及时采购所需的设备。

谢谢您的帮助!

此致

敬礼

XXX(发件人姓名)

格式转换

ChatGPT非常擅长不同格式之间的转换,例如JSON到HTML、XML、Markdown等。在下述例子中,我们有一个包含餐厅员工姓名和电子邮件的列表的JSON,我们希望将其从JSON转换为HTML。

例子
data_json = { "resturant employees" :[
 {"name":"Shyam", "email":"shyamjaiswal@gmail.com"},
 {"name":"Bob", "email":"bob32@gmail.com"},
 {"name":"Jai", "email":"jai87@gmail.com"}
]}
prompt = f"""
将以下Python字典从JSON转换为HTML表格,保留表格标题和列名:{data_json}
"""
response = get_completion(prompt)
print(response)
<table>
 <caption>resturant employees</caption>
 <thead>
 <tr>
 <th>name</th>
 <th>email</th>
 </tr>
 </thead>
 <tbody>
 <tr>
 <td>Shyam</td>
 <td>shyamjaiswal@gmail.com</td>
 </tr>
 <tr>
 <td>Bob</td>
 <td>bob32@gmail.com</td>
 </tr>
 <tr>
 <td>Jai</td>
 <td>jai87@gmail.com</td>
 </tr>
 </tbody>
</table>
from IPython.display import display, Markdown, Latex, HTML, JSON
display(HTML(response))

拼写及语法纠正

拼写及语法的检查与纠正是一个十分常见的需求,特别是使用非母语语言,例如发表英文论文时,这是一件十分重要的事情。

以下给了一个例子,有一个句子列表,其中有些句子存在拼写或语法问题,有些则没有,我们循环遍历每个句子,要求模型校对文本,如果正确则输出“未发现错误”,如果错误则输出纠正后的文本。

例子
text = [
 "The girl with the black and white puppies have a ball.", # The girl has a ball.
 "Yolanda has her notebook.", # ok
 "Its going to be a long day. Does the car need it’s oil changed?", # Homonyms
 "Their goes my freedom. There going to bring they’re suitcases.", # Homonyms
 "Your going to need you’re notebook.", # Homonyms
 "That medicine effects my ability to sleep. Have you heard of the butterfly affect?", # Homonyms
 "This phrase is to cherck chatGPT for spelling abilitty" # spelling
]

for i in range(len(text)):
 prompt = f"""请校对并更正以下文本,注意纠正文本保持原始语种,无需输出原始文本。
 如果您没有发现任何错误,请说“未发现错误”。

 例如:
 输入:I are happy.
 输出:I am happy.
 ```{text[i]}```"""
 response = get_completion(prompt)
 print(i, response)
0 The girl with the black and white puppies has a ball.
1 未发现错误。
2 It's going to be a long day. Does the car need its oil changed?
3 Their goes my freedom. They're going to bring their suitcases.
4 输出:You're going to need your notebook.
5 That medicine affects my ability to sleep. Have you heard of the butterfly effect?
6 This phrase is to check chatGPT for spelling ability.

以下是一个简单的类Grammarly纠错示例,输入原始文本,输出纠正后的文本,并基于Redlines输出纠错过程。

例子
text = f"""
Got this for my daughter for her birthday cuz she keeps taking \
mine from my room. Yes, adults also like pandas too. She takes \
it everywhere with her, and it's super soft and cute. One of the \
ears is a bit lower than the other, and I don't think that was \
designed to be asymmetrical. It's a bit small for what I paid for it \
though. I think there might be other options that are bigger for \
the same price. It arrived a day earlier than expected, so I got \
to play with it myself before I gave it to my daughter.
"""

prompt = f"校对并更正以下商品评论:```{text}```"
response = get_completion(prompt)
print(response)
I got this for my daughter's birthday because she keeps taking mine from my room. Yes, adults also like pandas too. She takes it everywhere with her, and it's super soft and cute. However, one of the ears is a bit lower than the other, and I don't think that was designed to be asymmetrical. It's also a bit smaller than I expected for the price. I think there might be other options that are bigger for the same price. On the bright side, it arrived a day earlier than expected, so I got to play with it myself before giving it to my daughter.
response = """
I got this for my daughter's birthday because she keeps taking mine from my room. Yes, adults also like pandas too. She takes it everywhere with her, and it's super soft and cute. However, one of the ears is a bit lower than the other, and I don't think that was designed to be asymmetrical. It's also a bit smaller than I expected for the price. I think there might be other options that are bigger for the same price. On the bright side, it arrived a day earlier than expected, so I got to play with it myself before giving it to my daughter.
"""
# 如未安装redlines,需先安装
>> pip3.8 install redlines
from redlines import Redlines
from IPython.display import display, Markdown

diff = Redlines(text,response)
display(Markdown(diff.output_markdown))

一个综合样例:文本翻译+拼写纠正+风格调整+格式转换

例子
text = f"""
Got this for my daughter for her birthday cuz she keeps taking \
mine from my room. Yes, adults also like pandas too. She takes \
it everywhere with her, and it's super soft and cute. One of the \
ears is a bit lower than the other, and I don't think that was \
designed to be asymmetrical. It's a bit small for what I paid for it \
though. I think there might be other options that are bigger for \
the same price. It arrived a day earlier than expected, so I got \
to play with it myself before I gave it to my daughter.
"""
prompt = f"""
针对以下三个反引号之间的英文评论文本,
首先进行拼写及语法纠错,
然后将其转化成中文,
再将其转化成优质淘宝评论的风格,从各种角度出发,分别说明产品的优点与缺点,并进行总结。
润色一下描述,使评论更具有吸引力。
输出结果格式为:
【优点】xxx
【缺点】xxx
【总结】xxx
注意,只需填写xxx部分,并分段输出。
将结果输出成Markdown格式。
```{text}```
"""
response = get_completion(prompt)
display(Markdown(response))

7. 文本扩展 Expanding

扩展是将短文本,例如一组说明或主题列表,输入到大型语言模型中,让模型生成更长的文本,例如基于某个主题的电子邮件或论文。这样做有一些很好的用途,例如将大型语言模型用作头脑风暴的伙伴。但这种做法也存在一些问题,例如某人可能会使用它来生成大量垃圾邮件。因此,当你使用大型语言模型的这些功能时,请仅以负责任的方式和有益于人们的方式使用它们。

在本章中,你将学会如何基于 OpenAI API 生成适用于每个客户评价的客户服务电子邮件。我们还将使用模型的另一个输入参数称为温度,这种参数允许您在模型响应中变化探索的程度和多样性。

定制客户邮件

我们将根据客户评价和情感撰写自定义电子邮件响应。因此,我们将给定客户评价和情感,并生成自定义响应即使用 LLM 根据客户评价和评论情感生成定制电子邮件。

我们首先给出一个示例,包括一个评论及对应的情感

英语例子
# given the sentiment from the lesson on "inferring",
# and the original customer message, customize the email
sentiment = "negative"

# review for a blender
review = f"""
So, they still had the 17 piece system on seasonal \
sale for around $49 in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between $70-$89 for the same \
system. And the 11 piece system went up around $10 or \
so in price also from the earlier sale price of $29. \
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy). Special tip when making \
smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie. \
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days.
"""
中文例子
# 我们可以在推理那章学习到如何对一个评论判断其情感倾向
sentiment = "negative"


# 一个产品的评价
review = f"""
他们在11月份的季节性销售期间以约49美元的价格出售17件套装,折扣约为一半。\
但由于某些原因(可能是价格欺诈),到了12月第二周,同样的套装价格全都涨到了70美元到89美元不等。\
11件套装的价格也上涨了大约10美元左右。\
虽然外观看起来还可以,但基座上锁定刀片的部分看起来不如几年前的早期版本那么好。\
不过我打算非常温柔地使用它,例如,\
我会先在搅拌机中将像豆子、冰、米饭等硬物研磨,然后再制成所需的份量,\
切换到打蛋器制作更细的面粉,或者在制作冰沙时先使用交叉切割刀片,然后使用平面刀片制作更细/不粘的效果。\
制作冰沙时,特别提示:\
将水果和蔬菜切碎并冷冻(如果使用菠菜,则轻轻煮软菠菜,然后冷冻直到使用;\
如果制作果酱,则使用小到中号的食品处理器),这样可以避免在制作冰沙时添加太多冰块。\
大约一年后,电机发出奇怪的噪音,我打电话给客服,但保修已经过期了,所以我不得不再买一个。\
总的来说,这些产品的总体质量已经下降,因此它们依靠品牌认可和消费者忠诚度来维持销售。\
货物在两天内到达。
"""

我们已经使用推断课程中学到的提取了情感,这是一个关于搅拌机的客户评价,现在我们将根据情感定制回复。

这里的指令是:假设你是一个客户服务AI助手,你的任务是为客户发送电子邮件回复,根据通过三个反引号分隔的客户电子邮件,生成一封回复以感谢客户的评价。

英语例子
prompt = f"""
You are a customer service AI assistant.
Your task is to send an email reply to a valued customer.
Given the customer email delimited by ```, \
Generate a reply to thank the customer for their review.
If the sentiment is positive or neutral, thank them for \
their review.
If the sentiment is negative, apologize and suggest that \
they can reach out to customer service.
Make sure to use specific details from the review.
Write in a concise and professional tone.
Sign the email as `AI customer agent`.
Customer review: ```{review}```
Review sentiment: {sentiment}
"""
response = get_completion(prompt)
print(response)
Dear Valued Customer,

Thank you for taking the time to leave a review about our product. We are sorry to hear that you experienced an increase in price and that the quality of the product did not meet your expectations. We apologize for any inconvenience this may have caused you.

We would like to assure you that we take all feedback seriously and we will be sure to pass your comments along to our team. If you have any further concerns, please do not hesitate to reach out to our customer service team for assistance.

Thank you again for your review and for choosing our product. We hope to have the opportunity to serve you better in the future.

Best regards,

AI customer agent
中文例子
prompt = f"""
你是一位客户服务的AI助手。
你的任务是给一位重要客户发送邮件回复。
根据客户通过“```”分隔的评价,生成回复以感谢客户的评价。提醒模型使用评价中的具体细节
用简明而专业的语气写信。
作为“AI客户代理”签署电子邮件。
客户评论:
```{review}```
评论情感:{sentiment}
"""
response = get_completion(prompt)
print(response)
尊敬的客户,

非常感谢您对我们产品的评价。我们非常抱歉您在购买过程中遇到了价格上涨的问题。我们一直致力于为客户提供最优惠的价格,但由于市场波动,价格可能会有所变化。我们深表歉意,如果您需要任何帮助,请随时联系我们的客户服务团队。

我们非常感谢您对我们产品的详细评价和使用技巧。我们将会把您的反馈传达给我们的产品团队,以便改进我们的产品质量和性能。

再次感谢您对我们的支持和反馈。如果您需要任何帮助或有任何疑问,请随时联系我们的客户服务团队。

祝您一切顺利!

AI客户代理

使用温度系数

接下来,我们将使用语言模型的一个称为“温度”的参数,它将允许我们改变模型响应的多样性。您可以将温度视为模型探索或随机性的程度。

例如,在一个特定的短语中,“我的最爱食品”最有可能的下一个词是“比萨”,其次最有可能的是“寿司”和“塔可”。因此,在温度为零时,模型将总是选择最有可能的下一个词,而在较高的温度下,它还将选择其中一个不太可能的词,在更高的温度下,它甚至可能选择塔可,而这种可能性仅为五分之一。您可以想象,随着模型继续生成更多单词的最终响应,“我的最爱食品是比萨”将会与第一个响应“我的最爱食品是塔可”产生差异。因此,随着模型的继续,这两个响应将变得越来越不同。

一般来说,在构建需要可预测响应的应用程序时,我建议使用温度为零。在所有课程中,我们一直设置温度为零,如果您正在尝试构建一个可靠和可预测的系统,我认为您应该选择这个温度。如果您尝试以更具创意的方式使用模型,可能需要更广泛地输出不同的结果,那么您可能需要使用更高的温度。

英语例子
# given the sentiment from the lesson on "inferring",
# and the original customer message, customize the email
sentiment = "negative"

# review for a blender
review = f"""
So, they still had the 17 piece system on seasonal \
sale for around $49 in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between $70-$89 for the same \
system. And the 11 piece system went up around $10 or \
so in price also from the earlier sale price of $29. \
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy). Special tip when making \
smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie. \
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days.
"""
prompt = f"""
You are a customer service AI assistant.
Your task is to send an email reply to a valued customer.
Given the customer email delimited by ```, \
Generate a reply to thank the customer for their review.
If the sentiment is positive or neutral, thank them for \
their review.
If the sentiment is negative, apologize and suggest that \
they can reach out to customer service.
Make sure to use specific details from the review.
Write in a concise and professional tone.
Sign the email as `AI customer agent`.
Customer review: ```{review}```
Review sentiment: {sentiment}
"""
response = get_completion(prompt, temperature=0.7)
print(response)
Dear valued customer,

Thank you for taking the time to share your review with us. We are sorry to hear that you were disappointed with the prices of our products and the quality of our blender. We apologize for any inconvenience this may have caused you.

We value your feedback and would like to make things right for you. Please feel free to contact our customer service team so we can assist you with any concerns or issues you may have. We are committed to providing you with the best possible service and products.

Thank you again for your review and for being a loyal customer. We hope to have the opportunity to serve you better in the future.

Sincerely,
AI customer agent
中文例子
prompt = f"""
你是一名客户服务的AI助手。
你的任务是给一位重要的客户发送邮件回复。
根据通过“```”分隔的客户电子邮件生成回复,以感谢客户的评价。
如果情感是积极的或中性的,感谢他们的评价。
如果情感是消极的,道歉并建议他们联系客户服务。
请确保使用评论中的具体细节。
以简明和专业的语气写信。
以“AI客户代理”的名义签署电子邮件。
客户评价:```{review}```
评论情感:{sentiment}
"""
response = get_completion(prompt, temperature=0.7)
print(response)
尊敬的客户,

非常感谢您对我们产品的评价。我们由衷地为您在购买过程中遇到的问题表示抱歉。我们确实在12月份的第二周调整了价格,但这是由于市场因素所致,并非价格欺诈。我们深刻意识到您对产品质量的担忧,我们将尽一切努力改进产品,以提供更好的体验。

我们非常感激您对我们产品的使用经验和制作技巧的分享。您的建议和反馈对我们非常重要,我们将以此为基础,进一步改进我们的产品。

如果您有任何疑问或需要进一步帮助,请随时联系我们的客户服务部门。我们将尽快回复您并提供帮助。

最后,请再次感谢您对我们产品的评价和选择。我们期待着未来与您的合作。

此致

敬礼

AI客户代理

在温度为零时,每次执行相同的提示时,您应该期望获得相同的完成。而使用温度为0.7,则每次都会获得不同的输出。

所以,您可以看到它与我们之前收到的电子邮件不同。让我们再次执行它,以显示我们将再次获得不同的电子邮件。

因此,我建议您自己尝试温度,以查看输出如何变化。总之,在更高的温度下,模型的输出更加随机。您几乎可以将其视为在更高的温度下,助手更易分心,但也许更有创造力。

7. 对话聊天

使用一个大型语言模型的一个令人兴奋的事情是,我们可以用它来构建一个定制的聊天机器人,只需要很少的工作量。在这一节中,我们将探索如何利用聊天格式(接口)与个性化或专门针对特定任务或行为的聊天机器人进行延伸对话。

像 ChatGPT 这样的聊天模型实际上是组装成以一系列消息作为输入,并返回一个模型生成的消息作为输出的。虽然聊天格式的设计旨在使这种多轮对话变得容易,但我们通过之前的学习可以知道,它对于没有任何对话的单轮任务也同样有用。

接下来,我们将定义两个辅助函数。第一个是单轮的,我们将prompt放入看起来像是某种用户消息的东西中。另一个则传入一个消息列表。这些消息可以来自不同的角色,我们会描述一下这些角色。

第一条消息是一个系统消息,它提供了一个总体的指示,然后在这个消息之后,我们有用户和助手之间的交替。如果你曾经使用过 ChatGPT 网页界面,那么你的消息是用户消息,而 ChatGPT 的消息是助手消息。系统消息则有助于设置助手的行为和角色,并作为对话的高级指示。你可以想象它在助手的耳边低语,引导它的回应,而用户不会注意到系统消息。

因此,作为用户,如果你曾经使用过 ChatGPT,你可能不知道 ChatGPT 的系统消息是什么,这是有意为之的。系统消息的好处是为开发者提供了一种方法,在不让请求本身成为对话的一部分的情况下,引导助手并指导其回应。

def get_completion(prompt, model="gpt-3.5-turbo"):
 messages = [{"role": "user", "content": prompt}]
 response = openai.ChatCompletion.create(
 model=model,
 messages=messages,
 temperature=0, # 控制模型输出的随机程度
 )
 return response.choices[0].message["content"]

def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0):
 response = openai.ChatCompletion.create(
 model=model,
 messages=messages,
 temperature=temperature, # 控制模型输出的随机程度
 )
# print(str(response.choices[0].message))
 return response.choices[0].message["content"]

现在让我们尝试在对话中使用这些消息。我们将使用上面的函数来获取从这些消息中得到的回答,同时,使用更高的 temperature(越高生成的越多样)。

系统消息说,你是一个说话像莎士比亚的助手。这是我们向助手描述它应该如何表现的方式。然后,第一个用户消息是,给我讲个笑话。接下来的消息是,为什么鸡会过马路?然后最后一个用户消息是,我不知道。

英语例子
messages = [
{'role':'system', 'content':'You are an assistant that speaks like Shakespeare.'},
{'role':'user', 'content':'tell me a joke'},
{'role':'assistant', 'content':'Why did the chicken cross the road'},
{'role':'user', 'content':'I don\'t know'} ]

response = get_completion_from_messages(messages, temperature=1)
print(response)
To get to the other side, fair sir.
中文例子
# 中文
messages = [
{'role':'system', 'content':'你是一个像莎士比亚一样说话的助手。'},
{'role':'user', 'content':'给我讲个笑话'},
{'role':'assistant', 'content':'鸡为什么过马路'},
{'role':'user', 'content':'我不知道'} ]
response = get_completion_from_messages(messages, temperature=1)
print(response)
因为它要去找“母鸡”。哈哈哈!(注:此为英文双关语,"chicken"是鸡的意思,也是胆小的意思;"cross the road"是过马路的意思,也是“破坏规则”的意思。)

让我们做另一个例子。助手的消息是,你是一个友好的聊天机器人,第一个用户消息是,嗨,我叫Isa。我们想要得到第一个用户消息。

英语例子
messages = [
{'role':'system', 'content':'You are friendly chatbot.'},
{'role':'user', 'content':'Hi, my name is Isa'} ]
response = get_completion_from_messages(messages, temperature=1)
print(response)
Hello Isa! It's great to meet you. How can I assist you today?
中文例子
# 中文
messages = [
{'role':'system', 'content':'你是个友好的聊天机器人。'},
{'role':'user', 'content':'Hi, 我是Isa。'} ]
response = get_completion_from_messages(messages, temperature=1)
print(response)
嗨,Isa,很高兴见到你!有什么我可以帮助你的吗?

让我们再试一个例子。系统消息是,你是一个友好的聊天机器人,第一个用户消息是,是的,你能提醒我我的名字是什么吗?

英语例子
messages = [
{'role':'system', 'content':'You are friendly chatbot.'},
{'role':'user', 'content':'Yes, can you remind me, What is my name?'} ]
response = get_completion_from_messages(messages, temperature=1)
print(response)
I'm sorry, but since we don't have any personal information about you, I don't know your name. Can you please tell me your name?
中文例子
# 中文
messages = [
{'role':'system', 'content':'你是个友好的聊天机器人。'},
{'role':'user', 'content':'好,你能提醒我,我的名字是什么吗?'} ]
response = get_completion_from_messages(messages, temperature=1)
print(response)
抱歉,我不知道您的名字,因为我们是虚拟的聊天机器人和现实生活中的人类在不同的世界中。

如上所见,模型实际上并不知道我的名字。

因此,每次与语言模型的交互都是一个独立的交互,这意味着我们必须提供所有相关的消息,以便模型在当前对话中进行引用。如果想让模型引用或 “记住” 对话的早期部分,则必须在模型的输入中提供早期的交流。我们将其称为上下文。让我们试试。

英语例子
messages = [
{'role':'system', 'content':'You are friendly chatbot.'},
{'role':'user', 'content':'Hi, my name is Isa'},
{'role':'assistant', 'content': "Hi Isa! It's nice to meet you. \
Is there anything I can help you with today?"},
{'role':'user', 'content':'Yes, you can remind me, What is my name?'} ]
response = get_completion_from_messages(messages, temperature=1)
print(response)
Your name is Isa.
中文例子
# 中文
messages = [
{'role':'system', 'content':'你是个友好的聊天机器人。'},
{'role':'user', 'content':'Hi, 我是Isa'},
{'role':'assistant', 'content': "Hi Isa! 很高兴认识你。今天有什么可以帮到你的吗?"},
{'role':'user', 'content':'是的,你可以提醒我, 我的名字是什么?'} ]
response = get_completion_from_messages(messages, temperature=1)
print(response)
当然可以!您的名字是Isa。

现在我们已经给模型提供了上下文,也就是之前的对话中提到的我的名字,然后我们会问同样的问题,也就是我的名字是什么。因为模型有了需要的全部上下文,所以它能够做出回应,就像我们在输入的消息列表中看到的一样。

订餐机器人

现在,我们构建一个 “订餐机器人”,我们需要它自动收集用户信息,接受比萨饼店的订单。

下面这个函数将收集我们的用户消息,以便我们可以避免手动输入,就像我们在刚刚上面做的那样。这个函数将从我们下面构建的用户界面中收集提示,然后将其附加到一个名为上下文的列表中,并在每次调用模型时使用该上下文。模型的响应也会被添加到上下文中,所以模型消息和用户消息都被添加到上下文中,因此上下文逐渐变长。这样,模型就有了需要的信息来确定下一步要做什么。

def collect_messages(_):
 prompt = inp.value_input
 inp.value = ''
 context.append({'role':'user', 'content':f"{prompt}"})
 response = get_completion_from_messages(context)
 context.append({'role':'assistant', 'content':f"{response}"})
 panels.append(
 pn.Row('User:', pn.pane.Markdown(prompt, width=600)))
 panels.append(
 pn.Row('Assistant:', pn.pane.Markdown(response, width=600, style={'background-color': '#F6F6F6'})))

 return pn.Column(*panels)

现在,我们将设置并运行这个 UI 来显示订单机器人。初始的上下文包含了包含菜单的系统消息。请注意,上下文会随着时间的推移而不断增长。

>> pip install panel
英语例子
import panel as pn # GUI
pn.extension()

panels = [] # collect display 

context = [ {'role':'system', 'content':"""
You are OrderBot, an automated service to collect orders for a pizza restaurant. \
You first greet the customer, then collects the order, \
and then asks if it's a pickup or delivery. \
You wait to collect the entire order, then summarize it and check for a final \
time if the customer wants to add anything else. \
If it's a delivery, you ask for an address. \
Finally you collect the payment.\
Make sure to clarify all options, extras and sizes to uniquely \
identify the item from the menu.\
You respond in a short, very conversational friendly style. \
The menu includes \
pepperoni pizza 12.95, 10.00, 7.00 \
cheese pizza 10.95, 9.25, 6.50 \
eggplant pizza 11.95, 9.75, 6.75 \
fries 4.50, 3.50 \
greek salad 7.25 \
Toppings: \
extra cheese 2.00, \
mushrooms 1.50 \
sausage 3.00 \
canadian bacon 3.50 \
AI sauce 1.50 \
peppers 1.00 \
Drinks: \
coke 3.00, 2.00, 1.00 \
sprite 3.00, 2.00, 1.00 \
bottled water 5.00 \
"""} ] # accumulate messages


inp = pn.widgets.TextInput(value="Hi", placeholder='Enter text here…')
button_conversation = pn.widgets.Button(name="Chat!")

interactive_conversation = pn.bind(collect_messages, button_conversation)

dashboard = pn.Column(
 inp,
 pn.Row(button_conversation),
 pn.panel(interactive_conversation, loading_indicator=True, height=300),
)
dashboard

现在我们可以要求模型创建一个 JSON 摘要发送给订单系统。

所以我们现在追加另一个系统消息,它是另一条prompt,我们说创建一个刚刚订单的 JSON 摘要,列出每个项目的价格,字段应包括1)披萨,包括尺寸,2)配料列表,3)饮料列表,4)辅菜列表,包括尺寸,最后是总价格。这里也可以在这里使用用户消息,不一定是系统消息。

请注意,这里我们使用了一个较低的temperature,因为对于这些类型的任务,我们希望输出相对可预测。

英语例子
messages = context.copy()
messages.append(
{'role':'system', 'content':'create a json summary of the previous food order. Itemize the price for each item\
 The fields should be 1) pizza, include size 2) list of toppings 3) list of drinks, include size 4) list of sides include size 5)total price '},
)
 #The fields should be 1) pizza, price 2) list of toppings 3) list of drinks, include size include price 4) list of sides include size include price, 5)total price '}, 

response = get_completion_from_messages(messages, temperature=0)
print(response)

中文例子
# 中文
import panel as pn # GUI
pn.extension()

panels = [] # collect display 

context = [{'role':'system', 'content':"""
你是订餐机器人,为披萨餐厅自动收集订单信息。
你要首先问候顾客。然后等待用户回复收集订单信息。收集完信息需确认顾客是否还需要添加其他内容。
最后需要询问是否自取或外送,如果是外送,你要询问地址。
最后告诉顾客订单总金额,并送上祝福。

请确保明确所有选项、附加项和尺寸,以便从菜单中识别出该项唯一的内容。
你的回应应该以简短、非常随意和友好的风格呈现。

菜单包括:

菜品:
意式辣香肠披萨(大、中、小) 12.95、10.00、7.00
芝士披萨(大、中、小) 10.95、9.25、6.50
茄子披萨(大、中、小) 11.95、9.75、6.75
薯条(大、小) 4.50、3.50
希腊沙拉 7.25

配料:
奶酪 2.00
蘑菇 1.50
香肠 3.00
加拿大熏肉 3.50
AI酱 1.50
辣椒 1.00

饮料:
可乐(大、中、小) 3.00、2.00、1.00
雪碧(大、中、小) 3.00、2.00、1.00
瓶装水 5.00
"""} ] # accumulate messages


inp = pn.widgets.TextInput(value="Hi", placeholder='Enter text here…')
button_conversation = pn.widgets.Button(name="Chat!")

interactive_conversation = pn.bind(collect_messages, button_conversation)

dashboard = pn.Column(
 inp,
 pn.Row(button_conversation),
 pn.panel(interactive_conversation, loading_indicator=True, height=300),
)
dashboard
messages = context.copy()
messages.append(
{'role':'system', 'content':'创建上一个食品订单的 json 摘要。\
逐项列出每件商品的价格,字段应该是 1) 披萨,包括大小 2) 配料列表 3) 饮料列表,包括大小 4) 配菜列表包括大小 5) 总价'},
)

response = get_completion_from_messages(messages, temperature=0)
print(response)

现在,我们已经建立了自己的订餐聊天机器人。请随意自定义并修改系统消息,以更改聊天机器人的行为,并使其扮演不同的角色和拥有不同的知识。

8. 总结

恭喜你完成了这门短期课程。

总的来说,在这门课程中,我们学习了关于prompt的两个关键原则:

  • 编写清晰具体的指令;
  • 如果适当的话,给模型一些思考时间。

你还学习了迭代式prompt开发的方法,并了解了如何找到适合你应用程序的prompt的过程是非常关键的。

我们还介绍了许多大型语言模型的功能,包括摘要、推断、转换和扩展。你还学会了如何构建自定义聊天机器人。在这门短期课程中,你学到了很多,希望你喜欢这些学习材料。

我们希望你能想出一些应用程序的想法,并尝试自己构建它们。请尝试一下并让我们知道你的想法。你可以从一个非常小的项目开始,也许它具有一定的实用价值,也可能完全没有实用价值,只是一些有趣好玩儿的东西。请利用你第一个项目的学习经验来构建更好的第二个项目,甚至更好的第三个项目等。或者,如果你已经有一个更大的项目想法,那就去做吧。

大型语言模型非常强大,作为提醒,我们希望大家负责任地使用它们,请仅构建对他人有积极影响的东西。在这个时代,构建人工智能系统的人可以对他人产生巨大的影响。因此必须负责任地使用这些工具。

现在,基于大型语言模型构建应用程序是一个非常令人兴奋和不断发展的领域。现在你已经完成了这门课程,我们认为你现在拥有了丰富的知识,可以帮助你构建其他人今天不知道如何构建的东西。因此,我希望你也能帮助我们传播并鼓励其他人也参加这门课程。

最后,希望你在完成这门课程时感到愉快,感谢你完成了这门课程。我们期待听到你构建的惊人之作。

Git: submodule 子模块简明教程

2021年8月28日 08:00

有种情况我们经常会遇到:某个工作中的项目需要包含并使用另一个项目。 也许是第三方库,或者你独立开发的,用于多个父项目的库。 现在问题来了:你想要把它们当做两个独立的项目,同时又想在一个项目中使用另一个。

Git 通过子模块来解决这个问题。 子模块允许你将一个 Git 仓库作为另一个 Git 仓库的子目录。 它能让你将另一个仓库克隆到自己的项目中,同时还保持提交的独立。


添加子模块

添加一个远程仓库项目 https://github.com/iphysresearch/GWToolkit.git 子模块到一个已有主仓库项目中。代码形式是 git submodule add <url> <repo_name>, 如下面的例子:

$ git submodule add https://github.com/iphysresearch/GWToolkit.git GWToolkit

这时,你会看到一个名为 GWToolkit 的文件夹在你的主仓库目录中。

如果你是旧版 Git 的话,你会发现 ./GWToolkit 目录中是空的,你还需要在执行一步「更新子模块」,才可以把远程仓库项目中的内容下载下来。

$ git submodule update --init --recursive

如果你不小心把路径写错了,可以用下面的代码来删掉,详细可查阅 git help submodule

$ git rm --cached GWToolkit

添加子模块后,若运行 git status,可以看到主仓库目录中会增加一个文件 .gitmodules,这个文件用来保存子模块的信息。

$ git status
位于分支 main
您的分支与上游分支 'origin/main' 一致。

要提交的变更:
 (使用 "git restore --staged <文件>..." 以取消暂存)
 新文件: .gitmodules
 新文件: GWToolkit

另外,在 .git/config 中会多出一块关于子模块信息的内容:

[submodule "GWToolkit"]
 url = https://github.com/iphysresearch/GWToolkit.git
 active = true

该配置文件保存了项目 URL 与已经拉取的本地目录之间的映射。如果有多个子模块,该文件中就会有多条记录。 要重点注意的是,该文件也像 .gitignore 文件一样受到(通过)版本控制。 它会和该项目的其他部分一同被拉取推送。 这就是克隆该项目的人知道去哪获得子模块的原因。

新生成的还有相关子模块的文件:.git/modules/GWToolkit/

此时若把上述「添加子模块」的修改更新到主仓库的 GitHub 上去的话,会看到相应子模块仓库的文件夹图标会有些不同:

此时还要留意的是,在终端 Git 命令操作下,位于主仓库目录中除了子模块外的任何子目录下进行的 commit 操作,都会记到主仓库下。只有在子模块目录内的任何 commit 操作,才会记到子模块仓库下。如下面的示例:

$ cd ~/projects/<module>
$ git log # log shows commits from Project <module>
$ cd ~/projects/<module>/<sub_dir>
$ git log # still commits from Project <module>
$ cd ~/projects/<module>/<submodule>
$ git log # commits from <submodule>

查看子模块

$ git submodule
 13fe233bb134e25382693905cfb982fe58fa94c9 GWToolkit (heads/main)

更新子模块

更新项目内子模块到最新版本:

$ git submodule update

更新子模块为远程项目的最新版本

$ git submodule update --remote

Clone 包含子模块的项目

对于你的主仓库项目合作者来说,如果只是 git clone 去下载主仓库的内容,那么你会发现子模块仓库的文件夹内是空的!

此时,你可以像上面「添加子模块」中说到的使用 git submodule update --init --recursive 来递归的初始化并下载子模块仓库的内容。

也可以分初始化和更新子模块两步走的方式来下载子模块仓库的内容:

$ git submodule init # 初始化子模块
$ git submodule update # 更新子模块

但是,如果你是第一次使用 git clone 下载主仓库的所有项目内容的话,我建议你可以使用如下的代码格式来把主仓库和其中子模块的所有内容,都一步到位的下载下来:

$ git clone --recursive <project url>

以后可以在子模块仓库目录下使用 git pull origin main 或者 git push 等来进行更新与合并等操作。


删除子模块

删除子模块比较麻烦,需要手动删除相关的文件,否则在添加子模块时有可能出现错误 同样以删除 GWToolkit 子模块仓库文件夹为例:

  1. 删除子模块文件夹

    $ git rm --cached GWToolkit
    $ rm -rf GWToolkit
    
  2. 删除 .gitmodules 文件中相关子模块的信息,类似于:

    [submodule "GWToolkit"]
     path = GWToolkit
     url = https://github.com/iphysresearch/GWToolkit.git
    
  3. 删除 .git/config 中相关子模块信息,类似于:

    [submodule "GWToolkit"]
     url = https://github.com/iphysresearch/GWToolkit.git
     active = true
    
  4. 删除 .git 文件夹中的相关子模块文件

    $ rm -rf .git/modules/GWToolkit
    

最后的话

  • 虽然 Git 提供的子模块功能已足够方便好用,但仍请在为主仓库项目添加子模块之前确保这是非常必要的。毕竟有很多编程语言(如 Go)或其他依赖管理工具(如 Ruby’s rubygems, Node.js’ npm, or Cocoa’s CocoaPods and Carthage)可以更好的 handle 类似的功能。
  • 主仓库项目的合作者并不会自动地看到子模块仓库的更新通知的。所以,更新子模块后一定要记得提醒一下主仓库项目的合作者 git submodule update

参考资料

GitHub 不再支持密码验证,如何在 macOS 上实现 Token 登陆配置

2021年8月15日 08:00

这两天我发现用 GitHub 的时候,push 不了代码了,不停地出如下的问题:

remote: Support for password authentication was removed on August 13, 2021. Please use a personal access token instead.
remote: Please see https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/ for more information.
fatal: 无法访问 'https://github.com/iphysresearch/GWToolkit.git/':The requested URL returned error: 403

其实官方早在去年年底的官方博文就给出了上面链接的说明,同时最近的官方博文中也有着详细的解释:

As previously announced, starting on August 13, 2021, at 09:00 PST, we will no longer accept account passwords when authenticating Git operations on GitHub.com. Instead, token-based authentication (for example, personal access, OAuth, SSH Key, or GitHub App installation token) will be required for all authenticated Git operations.

Please refer to this blog post for instructions on what you need to do to continue using git operations securely.

简单来说,就是从 2021 年 8 月 13 号开始,在 GitHub.com 上任何授权 Git 的行为都不再支持密码验证了,请用使用基于 token 的授权方式来替代,比如说:personal access, OAuth, SSH Key, or GitHub App installation token

(以下为正文)


下面就是基于我的 macOS 系统完成的配置并解决文体,仅供参考。

FYI:如果你想自己学着根据官方材料配置的话,可以参考如下两个链接就足够了。

总共两步:

第一步,先到 GitHub 个人设置里设置 token。

  • 打开自己的 GitHub主页,点击自己的头像找到 Settings 并进入,在左边目录栏找到 Developer settings - Personal access tokens,点击 Generate new token,按照步骤申请即可,过程简单。Scopes(范围)那里建议全选。

Token 申请成功后,一定记得要复制保存下来这个 token 字符串,因为这是你最后一次看到它的机会,忘记了的话 GitHub.com 是不会给你查找的,你只能重新申请新的。

第二步,在 macOS 的钥匙串访问里修改 GitHub 的密码为刚刚获得的 token。

用 Spotlight 或者 Alfred 等找到钥匙串访问 (Keychain Access),并打开。

右上方搜索 github.com,然后种类类别里找到 互联网密码(internet password),双击这个条目打开以后,在密码处填入第一步中复制保存下来的 token,最后存储更改即可。

Bayes Inference, Bayes Factor, Model Selection

2021年2月2日 08:00
模型选择 (model selection) 是统计推断专题里一个很重要的话题,对于引力波数据处理,引力波天文学和多信使天文学来说,尤是如此。此文是于 2021.1.19 在 ITP-CAS 为 Journal Club 准备的一个整理与调研。算是自己对 model selection 在个人在当前理解程度上的一个记录。

Bayes Inference

A primary aim of modern Bayesian inference is to construct a posterior distribution $$ p(\theta|d) $$ where $\theta$ is the set of model parameters and $d$ is the data associated with a measurement1. The posterior distribution $p(\theta|d)$ is the probability density function for the continuous variable $\theta$ (i.e. 15 parameters describing a CBC) given the data $d$ (strain data from a network of GW detectors). The probability that the true value of $\theta$ is between $(\theta’,\theta’+d\theta)$ is given by $p(\theta’|d)d\theta’$. It is normalised so that $$ \int d\theta p(\theta|d) = 1 . $$

According to Bayes theorem, the posterior distribution for multimessenger astrophysics is given by $$ p(\theta \mid d)=\frac{\mathcal{L}(d \mid \theta) \pi(\theta)}{\mathcal{Z}} $$ where $\mathcal{L}(d \mid \theta)$ is the likelihood function of the data given the $\theta$, $\pi(\theta)$ is the prior distribution of $\theta$, and $\mathcal{Z}$ is a normalisation factor called the ‘evidence’: $$ \mathcal{Z} \equiv \int d \theta \mathcal{L}(d \mid \theta) \pi(\theta) . $$

The likelihood function is something that we choose. It is a description of the measurement. By writing down a likelihood, we implicitly introduce a noise model. For GW astronomy, we typically assume a Gaussian-noise likelihood function that looks something like this $$ \mathcal{L}(d \mid \theta)=\frac{1}{\sqrt{2 \pi \sigma^{2}}} \exp \left(-\frac{1}{2} \frac{(d-\mu(\theta))^{2}}{\sigma^{2}}\right) , $$ where $\mu(\theta)$ is a template for the gravitational strain waveform given $\theta$ and $\sigma$ is the detector noise. Note the $\pi$ with no parantheses and no subscript is the mathematical constent, not a prior distribution. This likelihood function reflects our assumption that the noise in GW detctors is Gaussian. Note that the likelihood function is not normalised with respect the $\theta$ and so $\int d\theta\mathcal{L}(d|\theta)\neq1$.

In practical terms, the evidence is a single number. It usually does not mean anything by itself, but becomes useful when we compare one evidence with another evidence. Formally, the evidence is a likelihood function. Specifically, it is the completely marginalised likelihood function. It is therefore sometimes denoted $\mathcal{L}(d)$ with no $\theta$ dependence. However, we prefer to use $\mathcal{Z}$ to denote the fully marginalised likelihood function.


Bayes Factor

Recall from probability theory:

We wish to distinguish between two hypotheses: $\mathcal{H}_{1}, \mathcal{H}_{2}$.

Bayes’s theorm can be expressed in a form mare convenient for our pruposes by employing the completeness relationship. Using the completeness relation (note: $P\left(\mathcal{H}_{1}\right)+P\left(\mathcal{H}_{2}\right)=1$), we find that the probability that $\mathcal{H}_{1}$ is true given that $d$ is true is $$ \begin{aligned} P(\mathcal{H}_{1} \mid d) &=\frac{P(\mathcal{H}_{1}) P(d \mid \mathcal{H}_{1})}{P(d)}\\ &=\frac{P\left(\mathcal{H}_{1}\right) P\left(d \mid \mathcal{H}_{1}\right)}{P\left(d \mid \mathcal{H}_{1}\right) P\left(\mathcal{H}_{1}\right)+P\left(d \mid \mathcal{H}_{2}\right) P\left(\mathcal{H}_{2}\right)} \\ &=\frac{\mathcal{\Lambda}\left(\mathcal{H}_{1} \mid d\right)}{\mathcal{\Lambda}\left(\mathcal{H}_{1} \mid d\right)+P\left(\mathcal{H}_{2}\right) / P\left(\mathcal{H}_{1}\right)} \\ &=\frac{\mathcal{O}\left(\mathcal{H}_{1} \mid d\right)}{\mathcal{O}\left(\mathcal{H}_{1} \mid d\right)+1} \end{aligned} $$ where we have defined likelihood ratio $\mathcal{\Lambda}$ and odds ratio $\mathcal{O}$: $$ \begin{aligned} \mathcal{\Lambda}\left(\mathcal{H}_{1} \mid d\right) &:=\frac{P\left(d \mid \mathcal{H}_{1}\right)}{P\left(d \mid \mathcal{H}_{0}\right)} \\ \mathcal{O}\left(\mathcal{H}_{1} \mid d\right) &:=\frac{P\left(\mathcal{H}_{1}\right)}{P\left(\mathcal{H}_{0}\right)} \mathcal{\Lambda}\left(\mathcal{H}_{1} \mid d\right) = \frac{P\left(\mathcal{H}_{1}\right)}{P\left(\mathcal{H}_{0}\right)} \frac{P\left(d \mid \mathcal{H}_{1}\right)}{P\left(d \mid \mathcal{H}_{0}\right)} \end{aligned} $$

The ratio of the evidence for two different models is called the Bayes factor. For example, we can compare the evidence for a BBH waveform predicted by general relativity (model $M_A$ with parameters $\theta$ ) with a BBH waveform predicted by some other theory (model $M_B$ with parameters $\nu$): $$ \begin{aligned} \mathcal{Z}_{A}&=\int d \theta \mathcal{L}\left(d \mid \theta, M_{A}\right) \pi(\theta) ,\\ \mathcal{Z}_{B}&=\int d \nu \mathcal{L}\left(d \mid \nu, M_{B}\right) \pi(\nu) . \end{aligned} $$ The A/B Bayes factor is $$ \mathrm{BF}_{B}^{A}=\frac{\mathcal{Z}_{A}}{\mathcal{Z}_{B}} . $$ Note that the number of parameters in $\nu$ can be different from the number of parameters in $\theta$.

Formally, the correct metric to compare two models is not the Bayes factor, but rather the odds ratio. The odds ratio is the product of the Bayes factor with the prior odds $\pi_A/\pi_B$, which describes our likelihood of hypotheses $A$ and $B$: $$ \mathcal{O}_{B}^{A} \equiv \frac{\mathcal{Z}_{A}}{\mathcal{Z}_{B}} \frac{\pi_{A}}{\pi_{B}} =\frac{\pi_{A}}{\pi_{B}} \mathrm{BF}_{B}^{A} $$

In many practical applications, we set the prior odds ratio to unity, and so the odds ratio is the Bayes factor. This practice is sensible in many applications where our intuition tells us: until we do this measurement both hypotheses are equally likely.

There are some (fairly uncommon) examples where we might choose a different prior odds ratio. For example, we may construct a model in which general relativity (GR) is wrong. We may further suppose that there are multiple different ways in which it could be wrong, each corresponding to a different GR-is-wrong sub-hypothesis. If we calculated the odds ratio comparing one of these GR-is-wrong sub-hypotheses to the GR-is-right hypothesis, we would not assign equal prior odds to both hypotheses. Rather, we would assign at most 50% probability to the entire GR-is-wrong hypothesis, which would then have to be split among the various sub-hypotheses.


Model Selection

Bayesian evidence encodes two pieces of information:

  1. The likelihood tells us how well our model fits the data.
  2. The act of marginalisation tell us about the size of the volume of parameter space we used to carry out a fit.

This creates a sort of tension:

We want to get the best fit possible (high likelihood) but with a minimum prior volume.

A model with a decent fit and a small prior volume often yields a greater evidence than a model with an excellent fit and a huge prior volume.

In these cases, the Bayes factor penalises the more complicated model for being too complicated.

How to understand this comments?

We can obtain some insights into the model evidence by making a simple approximation to the integral over parameters:

  • Consider first the case of a model having a single parameter $w$. ($N=1$)
  • Assume that the posterior distribution is sharply peaked around the most probable value $w_\text{MAP}$, with width $\Delta w_\text{posterior}$, then we can approximate the in- tegral by the value of the integrand at its maximum times the width of the peak.
  • Assume that the prior is flat with width $\Delta w_\text{prior}$ so that $\pi(w) = 1/\Delta w_\text{prior}$.
A rough approximation to the model evidence if we assume that the posterior distribution over parameters is sharply peaked around its mode $w_\text{MAP}$ .
A rough approximation to the model evidence if we assume that the posterior distribution over parameters is sharply peaked around its mode $w_\text{MAP}$ .

then we have $$ \mathcal{Z} \equiv \int d w \mathcal{L}(d \mid w) \pi(w) \simeq \mathcal{L}\left(\mathcal{d} \mid w_{\mathrm{MAP}}\right) \frac{\Delta w_{\text {posterior }}}{\Delta w_{\text {prior }}} $$ and so taking logs we obtain (for a model having a set of $N$ parameters) $$ \ln \mathcal{Z}(\mathcal{d}) \simeq \ln \mathcal{L}\left(\mathcal{d} \mid w_{\mathrm{MAP}}\right)+N\ln \left(\frac{\Delta w_{\text {posterior }}}{\Delta w_{\text {prior }}}\right) $$

  • The first term: the fit to the data given by the most probable parameter values, and for a flat prior this would correspond to the log likelihood.
  • The second term (also called Occam factor) penalizes the model according to its complexity. Because $\Delta w_\text{posterior}<\Delta w_\text{prior}$ this term is negative, and it increases in magnitude as the ratio $\Delta w_\text{posterior}/\Delta w_\text{prior}$ gets smaller. Thus, if parameters are finely tuned to the data in the posterior distribution, then the penalty term is large.

Thus, in this very simple approximation, the size of the complexity penalty increases linearly with the number $N$ of adaptive parameters in the model. As we increase the complexity of the model, the first term will typically decrease, because a more complex model is better able to fit the data, whereas the second term will increase due to the dependence on $N$. The optimal model complexity, as determined by the maximum evidence, will be given by a trade-off between these two competing terms. We shall later develop a more refined version of this approximation, based on a Gaussian approximation to the posterior distribution.

A further insight into Bayesian model comparison and understand how the marginal likelihood can favour models of intermediate complexity by considering the Figure below.

Schematic illustration of the distribution of data sets for three models of different complexity, in which $M_1$ is the simplest and $M_3$ is the most complex. Note that the distributions are normalized. In this example, for the particular observed data set $\mathcal{D}_0$, the model $M_2$ with intermediate complexity has the largest evidence (the area und curve) .
Schematic illustration of the distribution of data sets for three models of different complexity, in which $M_1$ is the simplest and $M_3$ is the most complex. Note that the distributions are normalized. In this example, for the particular observed data set $\mathcal{D}_0$, the model $M_2$ with intermediate complexity has the largest evidence (the area und curve) .
Here the horizontal axis is a one-dimensional representation of the space of possible data sets, so that each point on this axis corresponds to a specific data set. We now consider three models $M_1$, $M_2$ and $M_3$ of successively increasing complexity. Imagine running these models generatively to produce example data sets, and then looking at the distribution of data sets that result. Any given model can generate a variety of different data sets since the parameters are governed by a prior probability distribution, and for any choice of the parameters there may be random noise on the target variables. To generate a particular data set from a specific model, we first choose the values of the parameters from their prior distribution $\pi(w)$, and then for these parameter values we sample the data from $\mathcal{L}(\mathcal{D}|w)$. Because the distributions $\mathcal{L}(\mathcal{D}|M_i)$ are normalized, we see that the particular data set $D_0$ can have the highest value of the evidence for the model of intermediate complexity. Essentially, the simpler model cannot fit the data well, whereas the more complex model spreads its predictive probability over too broad a range of data sets and so assigns relatively small probability to any one of them.

More insights:

  • If we compare two models where one model is a superset of the other—for example, we might compare GR and GR with non-tensor modes—and if the data are better explained by the simpler model, the log Bayes factor is typically modest, $$ |\log \mathrm{BF}| \approx (1,2). $$ Thus, it is difficult to completely rule out extensions to existing theories. We just obtain ever tighter constraints on the extended parameter space.

  • To make good use of Bayesian model comparison, we fully specify priors that are independent of the current data $\mathcal{D}$.

  • The sensitivity of the marginal likelihood to the prior range depends on the shape of the prior and is much greater for a uniform prior than a scale-invariant prior (see e.g. Gregory, 2005b, 61).

  • In most instances we are not particularly interested in the Occam factor itself, but only in the relative probabilities of the competing models as expressed by the Bayes factors. Because the Occam factor arises automatically in the marginalisation procedure, its effects will be present in any model-comparison calculation.

  • No Occam factors arise in parameter-estimation problems. Parameter estimation can be viewed as model comparison where the competing models have the same complexity so the Occam penalties are identical and cancel out.

  • On average the Bayes factor will always favour the correct model.

    To see this, consider two models $\mathcal{M}_\text{true}$ and $\mathcal{M}_1$ in which the truth corresponds to $\mathcal{M}_\text{true}$. We assume that the true posterior distribution from which the data are considered is contained within the set of models under consideration. For a given finite data, it is possible for the Bayes factor to be larger for the incorrect model. However, if we average the Bayes factor over the distribution of data sets, we obtain the expected Bayes factor in the form $$ \int \mathcal{Z}\left(\mathcal{D} \mid \mathcal{M}_{true}\right) \ln \frac{\mathcal{Z}\left(\mathcal{D} \mid \mathcal{M}_{true}\right)}{\mathcal{Z}\left(\mathcal{D} \mid \mathcal{M}_{1}\right)} \mathrm{d} \mathcal{D} $$ where the average has been taken with respect to the true distribution of the data. This is an example of Kullback-Leibler divergence and satisfies the property of always being positive unless the two distributions are equal in which case it is zero. Thus on average the Bayes factor will always favour the correct model.

We have seen from Figure 1 that the model evidence can be sensitive to many aspects of the prior, such as the behaviour in the tails. Indeed, the evidence is not defined if the prior is improper, as can be seen by noting that an improper prior has an arbitrary scaling factor (in other words, the normalization coefficient is not defined because the distribution cannot be normalized). If we consider a proper prior and then take a suitable limit in order to obtain an improper prior (for example, a Gaussian prior in which we take the limit of infinite variance) then the evidence will go to zero, as can be seen Figure 1 and the equation below the Figure 1. It may, however, be possible to consider the evidence ratio between two models first and then take a limit to obtain a meaningful answer.

In a practical application, therefore, it will be wise to keep aside an independent test set of data on which to evaluate the overall performance of the final system.


Reference


  1. By referring to model parameters, we are implicitly acknowledging that we begin with some model. Some authors make this explicit by writing the posterior as $p(\theta|d, M)$, where $M$ is the model. (Other authors sometimes use $I$ to denote the model.) We find this notation clunky and unnecessary since it goes without saying that one must always assume some model. If/when we consider two distinct models, we add an additional variable to denote the model. ↩︎

谱分析 (spectral analysis) 的 SciPy 代码解析

2021年1月14日 08:00

由于个人研究课题的需要,我仔细的研读了 Scipy.signal.spectral源码。此文就是关于此源码的详细解析教程,以方便我未来回溯相关谱分析 (spectral analysis) 的细节,也通过阅读成熟且优美的源代码提高自己的 Python 编程开发能力。内容涉及:stft, istft, csd, welch, coherence, periodogram, spectrogram, check_COLA, check_NOLA, lombscargle

PS: 目前主要总结了 STFT 短时傅里叶变换的源码解析,其他的待续吧~

SciPy

SciPy (pronounced “Sigh Pie”) 是 Python 中非常重要的一个开源的数学、科学和工程计算包。

SciPy (pronounced “Sigh Pie”) is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.


Spectral analysis

在 Signal processing (scipy.signal) 的 API 中,本文将会针对频谱分析 (spectral analysis) 的源码进行详细解读。

API Description
periodogram(x[, fs, window, nfft, detrend, …]) Estimate power spectral density using a periodogram.
welch(x[, fs, window, nperseg, noverlap, …]) Estimate power spectral density using Welch’s method.
csd(x, y[, fs, window, nperseg, noverlap, …]) Estimate the cross power spectral density, Pxy, using Welch’s method.
coherence(x, y[, fs, window, nperseg, …]) Estimate the magnitude squared coherence estimate, Cxy, of discrete-time signals X and Y using Welch’s method.
spectrogram(x[, fs, window, nperseg, …]) Compute a spectrogram with consecutive Fourier transforms.
lombscargle(x, y, freqs) Computes the Lomb-Scargle periodogram.
vectorstrength(events, period) Determine the vector strength of the events corresponding to the given period.
stft(x[, fs, window, nperseg, noverlap, …]) Compute the Short Time Fourier Transform (STFT).
istft(Zxx[, fs, window, nperseg, noverlap, …]) Perform the inverse Short Time Fourier transform (iSTFT).
check_COLA(window, nperseg, noverlap[, tol]) Check whether the Constant OverLap Add (COLA) constraint is met
check_NOLA(window, nperseg, noverlap[, tol]) Check whether the Nonzero Overlap Add (NOLA) constraint is met

scipy/signal/aspectral.py

首先要说明的是,我们会避轻就重的研究清楚各个算法是如何实现的,暂时忽略数据结构的识别等无关紧要的细节,直击和理解算法进行的逻辑和执行的动机。

spectrial.py 脚本 (SciPy v1.6.0) 中,共有函数定义 (def) 14 个,其中只有 10 个核心函数是 __all__ 的成员函数。他们之间的引用关系,可见下图:

`scipy/signal/spectral.py`
scipy/signal/spectral.py

其中蓝色框子里定义的是核心函数,白色框子是辅助函数,其他就是模块引用函数。

从图中的依赖关系就可以看出来,_spectral_helper 是最重要的底层辅助函数。下面我们将会以短时傅里叶变换 (stft) 为例,来梳理其是如何实现的。

stft

以一段数据为例,先看一下在 stft 一定参数下的执行效果:

import numpy as np
import matplotlib.pyplot as plt
import scipy.signal as signal

fname = './demo.dat'
data = np.loadtxt(fname)
t, hp, _ = data[:,0], data[:,1], data[:,2]

fs = 1/(t[1]-t[0]) # sampling points 采样率
print(fs, hp.size)
# output
# (16384, 32613)

hp 是我们的目标数据,采样率 fs 是 16384Hz。经过指定参数后的短时傅里叶变换 (Short Time Fourier Transform (STFT)) 后,可以得到如下结果:

freqs, time, Zxx = signal.spectral.stft(hp, fs=fs,
 nfft=fs//4, nperseg=fs//4, noverlap=fs//8,)
print(freqs.size, time.size, Zxx.shape)
# output
# 2049 17 (2049, 17)

根据函数 signal.spectral.stft 输出的频率分辨率 (freqs)、时间分辨率 (time) 和能谱矩阵 (Zxx) 结果,可以绘制该段数据的时间-频率图像。

plt.pcolormesh(time, freqs, np.abs(Zxx), #shading='gouraud' #开启这个参数就会图像插值,显示会更光滑
 )
plt.title('STFT Magnitude')
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.ylim(fmin, fmax)
plt.colorbar()
plt.tight_layout()
plt.show()

那么这个图像究竟是如何实现的呢?

我们将函数底层调用会用的到默认关键词参数,以及会引用到工具函数总结如下:

### stft ####################################################################
def stft(x, fs=1.0, window='hann', nperseg=256, noverlap=None, nfft=None,
 detrend=False, return_onesided=True, boundary='zeros', padded=True,
 axis=-1):
 freqs, time, Zxx = _spectral_helper(x, x, fs, window, nperseg, noverlap,
 nfft, detrend, return_onesided,
 scaling='spectrum', axis=axis,
 mode='stft', boundary=boundary,
 padded=padded)
 return freqs, time, Zxx
#############################################################################
def _spectral_helper(x, y, fs=1.0, window='hann', nperseg=None, noverlap=None,
 nfft=None, detrend='constant', return_onesided=True,
 scaling='density', axis=-1, mode='psd', boundary=None,
 padded=False):
 #... (暂略)
 pass
 return freqs, time, result

由此可见,stft 对数据 x 的处理,等价于对底层辅助函数 _spectral_helper 的输入参数 xy 取一致即可。下面,我们就开始对 _spectral_helper 函数探究其实如何给出输出结果 freqstimeZxx的。

_spectral_helper

x = hp.copy()
print('x.size = {}, fs = {}'.format(x.size, fs))
# output
# x.size = 32613, fs = 16384
  • Preparing

    _spectral_helper 的内部,前期是很多关于配置和调用的准备工作。在形式上是:

    # Preparing
    mode = 'stft' # or psd
    same_data = True
    outdtype = np.result_type(x, np.complex64) # 明确数据精度
    axis = -1
    window='hann'
    boundary='zeros'
    padded = True # not used
    detrend = False # not used
    scaling='spectrum' # or density
    return_onesided, sides=True, 'onesided'
    
    from scipy.signal._arraytools import const_ext, even_ext, odd_ext, zero_ext
    boundary_funcs = {'even': even_ext,
     'odd': odd_ext,
     'constant': const_ext,
     'zeros': zero_ext,
     None: None}
    

    boundary_funcs 是考虑对数据 x 将会以什么样的形式来延拓其边界。后面很快会再讨论到这一点。

  • Finetune

    下面的参数是比较重要的可调参数:

    ## Finetune ## 
    
    # parse window; nperseg = win.shape
    nperseg = nperseg=fs//4
    nperseg = int(nperseg)
    print('nperseg = {}'.format(nperseg))
    
    # parse window; if array like, then set nperseg = win.shape
    win, nperseg = signal.spectral._triage_segments(window, nperseg, input_length=x.shape[-1])
    print('nperseg = {}, win.size = {}'.format(nperseg, win.size))
    # win have the same dtype with outdtype
    if np.result_type(win, np.complex64) != outdtype:
     win = win.astype(outdtype)
    
    # nfft must be greater than or equal to nperseg.
    nfft = int(fs//4)
    print('nfft = {}'.format(nfft))
    
    # noverlap must be less than nperseg.
    noverlap = int(fs//8)
    print('noverlap = {}'.format(noverlap))
    
    nstep = nperseg - noverlap
    print('nstep = {}'.format(nstep))
    
    ## output ##
    # nperseg = 4096
    # nperseg = 4096, win.size = 4096
    # nfft = 4096
    # noverlap = 2048
    # nstep = 2048
    
    • nperseg 表示的是要解析的短时窗口序列的长度 (number of per segment)。
    • nfft 表示的是对每个短时窗口序列进行傅里叶变换 fft 时,所需要的长度 (num of fft)。也就是 nfft 点的离散傅里叶变换 (nfft-point discrete Fourier Transform (DFT))。所以,要求 nfft $\ge$ nperseg
    • noverlap 表示的是每个相邻短时窗口序列之间的重叠长度 (num of overlap)。一般来说通常取定为 nperseg//2,即 50% 的窗口重叠。
    • nstep 是间接计算出来的参数 (num of step),为了之后计算窗口滑动数目带来便利。显然有 nperseg = nstep + noverlap
    • 至于 win 就是每个短时窗口序列所对应的窗函数序列啦。根据指定的窗函数 windownperseg 和输入数据 x 的长度来判断和给出与 nperseg 相对应长度一直窗函数序列。其是通过辅助函数 _triage_segments 来实现的,比较简单,详情可以瞅一眼附录的 _triage_segments 即可。
  • Boundary extension + padding

    正式计算之前,还需要对我们的完整数据 x 进行边界延拓和补零的操作。下面的注释中也给出了这样操作偶的理由。

    # Padding occurs after boundary extension, so that the extended signal ends
    # in zeros, instead of introducing an impulse at the end.
    # I.e. if x = [..., 3, 2]
    # extend then pad -> [..., 3, 2, 2, 3, 0, 0, 0]
    # pad then extend -> [..., 3, 2, 0, 0, 0, 2, 3]
    
    print('x.size = {}'.format(x.size))
    
    # boundary extension
    x = boundary_funcs[boundary](x, nperseg//2, axis=-1)
    print('x.size = {} | {}'.format(x.size, hp.size + nperseg ))
    
    # Pad to integer number of windowed segments
    # I.e make x.shape[-1] = nperseg + (nseg-1)*nstep, with integer nseg
    nadd = (-(x.shape[-1]-nperseg) % nstep) % nperseg
    zeros_shape = list(x.shape[:-1]) + [nadd]
    x = np.concatenate((x, np.zeros(zeros_shape)), axis=-1)
    print('x.size = {} | {}'.format(x.size, x.size % nperseg ))
    
    ## output ##
    # x.size = 32613
    # x.size = 36709 | 36709
    # x.size = 36864 | 0
    
    • 先进行边界延拓,然后再补零。
    • 在默认参数boundary='zeros'下,边界延拓事实上是调用了外部函数 zero_ext 来实现的。延拓效果是在数据 x 的基础上,向各两侧分别补零 nperseg//2 长度。其他的延拓方法详情可以查看附录部分 (const_ext, even_ext, odd_ext, zero_ext)
    • 补零的目标是为了让最终的数据 x 长度可以被 nstep 整除,详情和细节可以看下面我自己绘制的示意图。其中,核心是利用Python 的负整除来计算出变量 add,另外,给予我的理解 % nperseg 这一步其实没有作用,因为被整除的部分总是一个小于 nperseg 的正整数。
`spectral_helper`: boundary extension + padding
spectral_helper: boundary extension + padding

到此就算是完成了所有关于数据和参数的准备工作,下一步就是关键了。

接下来就是对每一个 segment 进行傅里叶变换!

_fft_helper

首先要说的是,在短时傅里叶变换 stfd 中,默认是不考虑趋势去除 detrending 的。去掉趋势是振动信号的一种常见预处理方法,其可以有效降低环境干扰所带来的的不稳定性。对于我们这个"很短"数据段的时频图来说,一般不考虑。

Scipy 中定义了一个 _fft_helper 函数来一步到位计算给定数据下,各个加窗的短时数据序列的频谱信息。

# Perform the windowed FFTs
result = _fft_helper(x, win, detrend_func, nperseg, noverlap, nfft, sides)

def _fft_helper(x, win, detrend_func, nperseg, noverlap, nfft, sides):
 pass # ... (暂略)
 return result

Scipy 中原代码的实现其实非常 pythonic 风格的方法,那就是将数据 x 映射为一个数据矩阵,每行为每一个待处理的数据 segment,每列都是相同的 nperseg

  • Way 1

    # Created strided array of data segments
    if nperseg == 1 and noverlap == 0:
     result = x[..., np.newaxis]
    else:
     # https://stackoverflow.com/a/5568169
     step = nperseg - noverlap
     shape = x.shape[:-1]+((x.shape[-1]-noverlap)//step, nperseg)
     strides = x.strides[:-1]+(step*x.strides[-1], x.strides[-1])
     result = np.lib.stride_tricks.as_strided(x, shape=shape,
     strides=strides)
    

    这个方法看不太懂,也不太好理解,似乎是某种记录数值位的方式实现的。可以肯定的是效率很高。下面给个例子,感受下一般:

    ## Example
    >>> y = np.arange(1, 30+1, 1)
    >>> print(y)
    >>> shape = y.shape[:-1]+((y.shape[-1]-2)//5, 6)
    >>> print('shape = {} | y.shape[:-1] = {}'.format(shape, x.shape[:-1]))
    >>> strides = y.strides[:-1]+(5*y.strides[-1], y.strides[-1])
    >>> print('strides = {} | y.strides[:-1] = {}'.format(strides, y.strides[:-1], ))
    >>> np.lib.stride_tricks.as_strided(y, shape=shape,
     strides=strides)
    
    [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
     25 26 27 28 29 30]
    shape = (5, 6) | y.shape[:-1] = ()
    strides = (40, 8) | y.strides[:-1] = ()
    
    array([[ 1, 2, 3, 4, 5, 6],
     [ 6, 7, 8, 9, 10, 11],
     [11, 12, 13, 14, 15, 16],
     [16, 17, 18, 19, 20, 21],
     [21, 22, 23, 24, 25, 26]])
    
  • Way 2

    用我自己熟悉且容易理解的方式重新写这一部分:

    Slices = [ (i*nstep , i*nstep+nperseg) for i in range(x.size) if i*nstep+nperseg <= x.size]
    result = np.asarray([x[s[0]:s[1]] for s in Slices])
    

    可以看到,我无非是记录下每一个 segment 的前后 index,然后切片罢了。

  • 加窗+傅里叶变换

    当你的时频矩阵 result 已经准备好了以后,接下来就是充分利用 python 中的 “广播机制”,对每一行加窗,并且做 “nfft"-点实数域的离散傅里叶变换即可。

    # Detrend each data segment individually
    result = detrend_func(result) # 对 stft 来说,这就是个恒等映射。
    
    # Apply window by multiplication
    result = win * result
    
    # Perform the fft.
    # 1-D *nfft*-point discrete Fourier Transform (DFT) of a real-valued array 
    result = scipy.fft.rfft(result.real, n=nfft)
    print('result.shape = {} | nfft.shape = {}'.format(result.shape, nfft))
    ## outut ##
    # result.shape = (17, 2049) | nfft.shape = 4096
    
    • 得到的时频矩阵 result 显然行数是时间分辨率,纵轴对应的是频域分辨率,它取决于参数 nfft,间接的也却取决于 nperseg

    • 上述是基于单边 “one sided” 的情况,若对应于 “two sided”,则要改用 scipy.fft.fft 函数。

    • _fft_helper 到此开始输出 result

在最后的输出之前,_spectral_helper 还要做三件事:

  1. 调整时频矩阵 result 的数值单位。可以通过参数 scaling 来调节:

    if scaling == 'density':
     scale = 1.0 / (fs * (win*win).sum())
    elif scaling == 'spectrum': # stft 默认用频谱 spectrum
     scale = 1.0 / win.sum()**2
    else:
     raise ValueError('Unknown scaling: %r' % scaling)
    
    if mode == 'stft':
     scale = np.sqrt(scale)
    
    result *= scale
    
    • density: Select Power the loss of power is compensated. The ratio of the sum of the squared data before and after applying the window is used as a normalization factor. The total power within the spectrum therefore always corresponds to the power of the data before applying the window.

    • spectrum: Selecting Amplitude normalizes to the gain of the used window function, i.e. the sum of all values is divided by their number. This compensates the damping of the amplitudes caused by applying the window. This is especially useful to measure peaks within the spectrum.

    • 这里有一个非常棒的材料讲各类 STFT 的频谱单位:

    • 其他情况 (onsesided, psd, same_data)下,对时频矩阵的处理:

      if sides == 'onesided' and mode == 'psd': # not used for stft
       if nfft % 2:
       result[..., 1:] *= 2
       else:
       # Last point is unpaired Nyquist freq point, don't double
       result[..., 1:-1] *= 2
      
      # All imaginary parts are zero anyways
      if same_data and mode != 'stft':
       result = result.real
      
  2. 输出时间分辨率序列。

    time = np.arange(nperseg/2, x.shape[-1] - nperseg/2 + 1,
     nperseg - noverlap)/float(fs)
    if boundary is not None:
     time -= (nperseg/2) / fs
    

    仔细观察就可以知道,这对应的是每一个 segment 的中心点处的时刻。

  3. 输出频率分辨率序列。

    if sides == 'twosided':
     freqs = scipy.fft.fftfreq(nfft, 1/fs)
    elif sides == 'onesided':
     freqs = scipy.fft.rfftfreq(nfft, 1/fs) # stft
    

    简而言之,就是给定离散傅里叶变换究竟是几点的,并且要说明采样率信息(可知 segment 对应的时长信息),即可给出在每个 segment 上傅里叶变换后有着怎样的频率分量。

    我也可以手撸一把,给出对应于 scipy.fft.rfftfreq 更加直接的含义:

    freqs = np.arange(0, 1/2+1/nfft, 1/nfft)*fs # when `nfft` is even
    
    • np.arange 部分是目标给出从0 到 1/2 之间,以 1/nfft 作为分辨率间隔的等差序列。
    • *fs 表示的是要在 0 到 Nyquist frequency 范围内的频率分辨率。

stft 已讲解完)

csd

Estimate the cross power spectral density (csd), Pxy, using Welch’s method.

在上述介绍的 _spectral_helper 的基础上,就可以实现互功率谱密度(cross power spectral density),两个频域函数之间的功率谱密度。其实部为共谱密度(简称“共谱”),虚部为正交谱密度。

由于随机信号是时域无限信号,不具备可积分条件,因此不能直接进行傅里叶变换。又因为随机信号的频率、幅值、相位都是随机的, 因此从理论上讲,一般不作幅值谱和相位谱分析,而是用具有统计特性的功率谱密度 (power spectral density) 来作谱分析。

  • 自功率谱密度函数 (Auto-power spectral density function)

    • 平稳随机过程的功率谱密度与自相关函数是一傅里叶变换偶对 (fourier transform dual pair) $$ \begin{array}{l} S_{x}(\omega)=\int_{-\infty}^{\infty} R_{x}(\tau) e^{-j \omega \tau} d \tau \ R_{x}(\tau)=\int_{-\infty}^{\infty} S_{x}(\omega) e^{j \omega \tau} d \omega \end{array} $$ $x(t)$ 是零均值的随机信号,且 $x(t)$ 无周期性分量,其自相关函数 $R_x(\tau\rightarrow\infty)=0$,自相关函数满足傅里叶变换条件 $\int_{-\infty}^{\infty}\left|R_{x}(\tau)\right| d \tau<0$。

    • 性质:实偶函数+双边谱;单边谱 $G_x(\omega)$ (非负频率上的谱) $$ G_{x}(\omega) =2 S_{x}(\omega) =2 \int_{-\infty}^{\infty} R_{x}(\tau) e^{-j \omega \tau} d \tau \quad(\omega>0) $$

    • 物理意义:信号的能量在不同频率成分上的分布。

    • 自功率谱密度函数与幅值谱 (amplitude spectrum) 的关系: $$ S_{x}(f)=\lim _{T \rightarrow \infty} \frac{1}{2 T}|X(f)|^{2} $$

  • 互功率谱密度函数 (cross-power spectral density function)

    • 定义:(互相关函数满足傅里叶变换条件 $\int_{-\infty}^{\infty}\left|R_{x y}(\tau)\right| d \tau<\infty$)

    $$ \begin{array}{l} S_{x y}(\omega)=\int_{-\infty}^{\infty} R_{x y}(\tau) e^{-j \omega \tau} d \tau \ R_{x y}(\tau)=\frac{1}{2 \pi} \int_{-\infty}^{\infty} S_{x y}(\omega) e^{j \omega \tau} d \omega \end{array} $$

    • 单边互谱密度函数 (one-sided cross-power spectrum) $$ G_{x y}(\omega)=2 \int_{-\infty}^{\infty} R_{x y}(\tau) e^{-j \omega \tau} d \tau \quad(0<\omega<\infty) $$
  • 谱相干函数 (spectral coherence function):

    • 测评输入、输出信号间的因果性,即输出信号的功率谱中有多少是所测试输入量引起的响应。

    $$ \gamma_{x y}^{2}(\omega)=\frac{\left|G_{x y}(\omega)\right|^{2}}{G_{x}(\omega) G_{y}(\omega)} $$

    • $\gamma_{x y}^{2}(\omega)=1$ $y(t)$ 和 $x(t)$ 完全相关
    • $\gamma_{x y}^{2}(\omega)=0$ $y(t)$ 和 $x(t)$ 完全无关
    • $1>\gamma_{x y}^{2}(\omega)>0$ $y(t)$ 和 $x(t)$ 部分相关
      • 测试中有外界干扰
      • 输出 $y(t)$ 是输入 $x(t)$ 和其他输入的综合输出
      • 联系 $x(t)$ 和 $y(t)$ 的系统是非线性的
  • 频率响应函数 (frequenct response function) $$ H(\omega)=\frac{G_{x y}(\omega)}{G_{x}(\omega)} $$

来个例子看看 csd 的实际效果是怎样的:

>>> from scipy import signal
>>> import matplotlib.pyplot as plt
# Generate two test signals with some common features.
>>> fs = 10e3
>>> N = 1e5
>>> amp = 20
>>> freq = 1234.0
>>> noise_power = 0.001 * fs / 2
>>> time = np.arange(N) / fs
>>> b, a = signal.butter(2, 0.25, 'low')
>>> x = np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
>>> y = signal.lfilter(b, a, x)
>>> x += amp*np.sin(2*np.pi*freq*time)
>>> y += np.random.normal(scale=0.1*np.sqrt(noise_power), size=time.shape)
# Compute and plot the magnitude of the cross spectral density.
>>> f, Pxy = signal.csd(x, y, fs, nperseg=1024)
>>> f, Pxx = signal.csd(x, x, fs, nperseg=1024)
>>> f, Pyy = signal.csd(y, y, fs, nperseg=1024)
>>> plt.semilogy(f, np.abs(Pxy), label = 'Pxy')
>>> plt.semilogy(f, np.abs(Pxx), label = 'Pxx')
>>> plt.semilogy(f, np.abs(Pyy), label = 'Pyy')
>>> plt.xlabel('frequency [Hz]')
>>> plt.ylabel('CSD [V**2/Hz]')
>>> plt.legend()
>>> plt.show()

其源码的实现:

def csd(x, y, fs=1.0, window='hann', nperseg=None, noverlap=None, nfft=None,
 detrend='constant', return_onesided=True, scaling='density',
 axis=-1, average='mean'):
 """Estimate the cross power spectral density, Pxy, using Welch's method."""
 freqs, _, Pxy = _spectral_helper(x, y, fs, window, nperseg, noverlap, nfft,
 detrend, return_onesided, scaling, axis,
 mode='psd')

 # Average over windows.
 if len(Pxy.shape) >= 2 and Pxy.size > 0:
 if Pxy.shape[-1] > 1:
 if average == 'median':
 Pxy = np.median(Pxy, axis=-1) / _median_bias(Pxy.shape[-1])
 elif average == 'mean':
 Pxy = Pxy.mean(axis=-1)
 else:
 raise ValueError('average must be "median" or "mean", got %s'
 % (average,))
 else:
 Pxy = np.reshape(Pxy, Pxy.shape[:-1])

 return freqs, Pxy

_median_bias

def _median_bias(n):
 """
 Returns the bias of the median of a set of periodograms relative to
 the mean.
 """
 ii_2 = 2 * np.arange(1., (n-1) // 2 + 1)
 return 1 + np.sum(1. / (ii_2 + 1) - 1. / ii_2)

welch (PSD)

Estimate power spectral density using Welch’s method.

Welch’s method 1 computes an estimate of the power spectral density (PSD) by dividing the data into overlapping segments, computing a modified periodogram for each segment and averaging the periodograms.

源码是很简单直接的:

def welch(x, fs=1.0, window='hann', nperseg=None, noverlap=None, nfft=None,
 detrend='constant', return_onesided=True, scaling='density',
 axis=-1, average='mean'):
 """Estimate power spectral density using Welch's method."""
 freqs, Pxx = csd(x, x, fs=fs, window=window, nperseg=nperseg,
 noverlap=noverlap, nfft=nfft, detrend=detrend,
 return_onesided=return_onesided, scaling=scaling,
 axis=axis, average=average)

 return freqs, Pxx.real

这就是所谓的 Welch 方法下的 PSD。看个例子,体会一般:

>>> from scipy import signal
>>> import matplotlib.pyplot as plt
>>> np.random.seed(1234)
# Generate a test signal, a 2 Vrms sine wave at 1234 Hz, corrupted by
# 0.001 V**2/Hz of white noise sampled at 10 kHz.
>>> fs = 10e3
>>> N = 1e5
>>> amp = 2*np.sqrt(2)
>>> freq = 1234.0
>>> noise_power = 0.001 * fs / 2
>>> time = np.arange(N) / fs
>>> x = amp*np.sin(2*np.pi*freq*time)
>>> x += np.random.normal(scale=np.sqrt(noise_power), size=time.shape)

现在来计算并看看这个数据的 PSD,以及噪声的功率密度:

# Compute and plot the power spectral density.
>>> f, Pxx_den = signal.welch(x, fs, nperseg=1024)
>>> plt.semilogy(f, Pxx_den)
>>> plt.ylim([0.5e-3, 1])
>>> plt.xlabel('frequency [Hz]')
>>> plt.ylabel('PSD [V**2/Hz]')
>>> plt.show()
# If we average the last half of the spectral density, to exclude the
# peak, we can recover the noise power on the signal.
>>> np.mean(Pxx_den[256:])
# 0.0009924865443739191

上面的例子中是默认参数 scaling='density',下面试一下 scaling='spectrum' 我们可以观察到 amplitude

# Now compute and plot the power spectrum.
>>> f, Pxx_spec = signal.welch(x, fs, 'flattop', 1024, scaling='spectrum')
>>> plt.figure()
>>> plt.semilogy(f, np.sqrt(Pxx_spec))
>>> plt.xlabel('frequency [Hz]')
>>> plt.ylabel('Linear spectrum [V RMS]')
>>> plt.show()
# The peak height in the power spectrum is an estimate of the RMS
# amplitude.
>>> np.sqrt(Pxx_spec.max())
# 2.0077340678640727

# If we now introduce a discontinuity in the signal, by increasing the
# amplitude of a small portion of the signal by 50, we can see the
# corruption of the mean average power spectral density, but using a
# median average better estimates the normal behaviour.
>>> x[int(N//2):int(N//2)+10] *= 50.
>>> f, Pxx_den = signal.welch(x, fs, nperseg=1024)
>>> f_med, Pxx_den_med = signal.welch(x, fs, nperseg=1024, average='median')
>>> plt.semilogy(f, Pxx_den, label='mean')
>>> plt.semilogy(f_med, Pxx_den_med, label='median')
>>> plt.ylim([0.5e-3, 1])
>>> plt.xlabel('frequency [Hz]')
>>> plt.ylabel('PSD [V**2/Hz]')
>>> plt.legend()
>>> plt.show()

coherence

Estimate the magnitude squared coherence estimate, Cxy, of discrete-time signals X and Y using Welch’s method.

Cxy = abs(Pxy)**2/(Pxx*Pyy), where Pxx and Pyy are power spectral density estimates of X and Y, and Pxy is the cross spectral density estimate of X and Y.

源码:

def coherence(x, y, fs=1.0, window='hann', nperseg=None, noverlap=None,
 nfft=None, detrend='constant', axis=-1):
 """
 Estimate the magnitude squared coherence estimate, Cxy, of
 discrete-time signals X and Y using Welch's method.
 """
 freqs, Pxx = welch(x, fs=fs, window=window, nperseg=nperseg,
 noverlap=noverlap, nfft=nfft, detrend=detrend,
 axis=axis)
 _, Pyy = welch(y, fs=fs, window=window, nperseg=nperseg, noverlap=noverlap,
 nfft=nfft, detrend=detrend, axis=axis)
 _, Pxy = csd(x, y, fs=fs, window=window, nperseg=nperseg,
 noverlap=noverlap, nfft=nfft, detrend=detrend, axis=axis)

 Cxy = np.abs(Pxy)**2 / Pxx / Pyy

 return freqs, Cxy

例子:

>>> from scipy import signal
>>> import matplotlib.pyplot as plt
# Generate two test signals with some common features.
>>> fs = 10e3
>>> N = 1e5
>>> amp = 20
>>> freq = 1234.0
>>> noise_power = 0.001 * fs / 2
>>> time = np.arange(N) / fs
>>> b, a = signal.butter(2, 0.25, 'low')
>>> x = np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
>>> y = signal.lfilter(b, a, x)
>>> x += amp*np.sin(2*np.pi*freq*time)
>>> y += np.random.normal(scale=0.1*np.sqrt(noise_power), size=time.shape)
# Compute and plot the coherence.
>>> f, Cxy = signal.coherence(x, y, fs, nperseg=1024)
>>> f, Pxx = signal.welch(x, fs, nperseg=1024)
>>> f, Pyy = signal.welch(y, fs, nperseg=1024)
>>> lnxy = plt.semilogy(f, Cxy, color = 'r', label = 'Cxy')
>>> plt.ylabel('Coherence')
>>> plt.twinx()
>>> lnx = plt.semilogy(f, Pxx, color = 'g', label = 'Pxx')
>>> lny = plt.semilogy(f, Pyy, color = 'b', label = 'Pyy')
>>> plt.ylabel('PSD [V**2/Hz]')
>>> plt.xlabel('frequency [Hz]')
>>> lns = lnxy+lnx+lny
>>> labs = [l.get_label() for l in lns]
>>> plt.legend(lns, labs)
>>> plt.show()

periodogram (PSD)

Estimate power spectral density using a periodogram.

下面是用 periodogram 计算 PSD 的源码:

def periodogram(x, fs=1.0, window='boxcar', nfft=None, detrend='constant',
 return_onesided=True, scaling='density', axis=-1):
 """Estimate power spectral density using a periodogram."""
 x = np.asarray(x)

 if x.size == 0:
 return np.empty(x.shape), np.empty(x.shape)

 if window is None:
 window = 'boxcar'

 if nfft is None:
 nperseg = x.shape[axis]
 elif nfft == x.shape[axis]:
 nperseg = nfft
 elif nfft > x.shape[axis]:
 nperseg = x.shape[axis]
 elif nfft < x.shape[axis]:
 s = [np.s_[:]]*len(x.shape)
 s[axis] = np.s_[:nfft]
 x = x[tuple(s)]
 nperseg = nfft
 nfft = None

 return welch(x, fs=fs, window=window, nperseg=nperseg, noverlap=0,
 nfft=nfft, detrend=detrend, return_onesided=return_onesided,
 scaling=scaling, axis=axis)

例子:

>>> from scipy import signal
>>> import matplotlib.pyplot as plt
>>> np.random.seed(1234)
# Generate a test signal, a 2 Vrms sine wave at 1234 Hz, corrupted by
# 0.001 V**2/Hz of white noise sampled at 10 kHz.
>>> fs = 10e3
>>> N = 1e5
>>> amp = 2*np.sqrt(2)
>>> freq = 1234.0
>>> noise_power = 0.001 * fs / 2
>>> time = np.arange(N) / fs
>>> x = amp*np.sin(2*np.pi*freq*time)
>>> x += np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
# Compute and plot the power spectral density.
>>> f, Pxx_den = signal.periodogram(x, fs)
>>> plt.semilogy(f, Pxx_den)
>>> plt.ylim([1e-7, 1e2])
>>> plt.xlabel('frequency [Hz]')
>>> plt.ylabel('PSD [V**2/Hz]')
>>> plt.show()
# If we average the last half of the spectral density, to exclude the
# peak, we can recover the noise power on the signal.
>>> np.mean(Pxx_den[25000:])
# 0.00099728892368242854

# Now compute and plot the power spectrum.
>>> f, Pxx_spec = signal.periodogram(x, fs, 'flattop', scaling='spectrum')
>>> plt.figure()
>>> plt.semilogy(f, np.sqrt(Pxx_spec))
>>> plt.ylim([1e-4, 1e1])
>>> plt.xlabel('frequency [Hz]')
>>> plt.ylabel('Linear spectrum [V RMS]')
>>> plt.show()
# The peak height in the power spectrum is an estimate of the RMS
# amplitude.
>>> np.sqrt(Pxx_spec.max())
# 2.0077340678640727

spectrogram

Compute a spectrogram with consecutive Fourier transforms.

Spectrograms can be used as a way of visualizing the change of a nonstationary signal’s frequency content over time.

源码:

def spectrogram(x, fs=1.0, window=('tukey', .25), nperseg=None, noverlap=None,
 nfft=None, detrend='constant', return_onesided=True,
 scaling='density', axis=-1, mode='psd'):
 """Compute a spectrogram with consecutive Fourier transforms."""

 modelist = ['psd', 'complex', 'magnitude', 'angle', 'phase']
 if mode not in modelist:
 raise ValueError('unknown value for mode {}, must be one of {}'
 .format(mode, modelist))

 # need to set default for nperseg before setting default for noverlap below
 window, nperseg = _triage_segments(window, nperseg,
 input_length=x.shape[axis])

 # Less overlap than welch, so samples are more statisically independent
 if noverlap is None:
 noverlap = nperseg // 8

 if mode == 'psd':
 freqs, time, Sxx = _spectral_helper(x, x, fs, window, nperseg,
 noverlap, nfft, detrend,
 return_onesided, scaling, axis,
 mode='psd')

 else:
 freqs, time, Sxx = _spectral_helper(x, x, fs, window, nperseg,
 noverlap, nfft, detrend,
 return_onesided, scaling, axis,
 mode='stft')

 if mode == 'magnitude':
 Sxx = np.abs(Sxx)
 elif mode in ['angle', 'phase']:
 Sxx = np.angle(Sxx)
 if mode == 'phase':
 # Sxx has one additional dimension for time strides
 if axis < 0:
 axis -= 1
 Sxx = np.unwrap(Sxx, axis=axis)

 # mode =='complex' is same as `stft`, doesn't need modification

 return freqs, time, Sxx

例子:

>>> from scipy import signal
>>> from scipy.fft import fftshift
>>> import matplotlib.pyplot as plt
# Generate a test signal, a 2 Vrms sine wave whose frequency is slowly
# modulated around 3kHz, corrupted by white noise of exponentially
# decreasing magnitude sampled at 10 kHz.
>>> fs = 10e3
>>> N = 1e5
>>> amp = 2 * np.sqrt(2)
>>> noise_power = 0.01 * fs / 2
>>> time = np.arange(N) / float(fs)
>>> mod = 500*np.cos(2*np.pi*0.25*time)
>>> carrier = amp * np.sin(2*np.pi*3e3*time + mod)
>>> noise = np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
>>> noise *= np.exp(-time/5)
>>> x = carrier + noise
# Compute and plot the spectrogram.
>>> f, t, Sxx = signal.spectrogram(x, fs)
>>> plt.pcolormesh(t, f, Sxx, shading='gouraud')
>>> plt.ylabel('Frequency [Hz]')
>>> plt.xlabel('Time [sec]')
>>> plt.show()

# Note, if using output that is not one sided, then use the following:
>>> f, t, Sxx = signal.spectrogram(x, fs, return_onesided=False)
>>> plt.pcolormesh(t, fftshift(f), fftshift(Sxx, axes=0), shading='gouraud')
>>> plt.ylabel('Frequency [Hz]')
>>> plt.xlabel('Time [sec]')
>>> plt.show()

istft

Perform the inverse Short Time Fourier transform (iSTFT).

这部分源码比较长,先直接看 stft 的例子,然后是 istft 的例子。

>>> from scipy import signal
>>> import matplotlib.pyplot as plt
# Generate a test signal, a 2 Vrms sine wave at 50Hz corrupted by
# 0.001 V**2/Hz of white noise sampled at 1024 Hz.
>>> fs = 1024
>>> N = 10*fs
>>> nperseg = 512
>>> amp = 2 * np.sqrt(2)
>>> noise_power = 0.001 * fs / 2
>>> time = np.arange(N) / float(fs)
>>> carrier = amp * np.sin(2*np.pi*50*time)
>>> noise = np.random.normal(scale=np.sqrt(noise_power),
... size=time.shape)
>>> x = carrier + noise
# Compute the STFT, and plot its magnitude
>>> f, t, Zxx = signal.stft(x, fs=fs, nperseg=nperseg)
>>> plt.figure()
>>> plt.pcolormesh(t, f, np.abs(Zxx), vmin=0, vmax=amp, shading='gouraud')
>>> plt.ylim([f[1], f[-1]])
>>> plt.title('STFT Magnitude')
>>> plt.ylabel('Frequency [Hz]')
>>> plt.xlabel('Time [sec]')
>>> plt.yscale('log')
>>> plt.show()

# Zero the components that are 10% or less of the carrier magnitude,
# then convert back to a time series via inverse STFT
>>> Zxx = np.where(np.abs(Zxx) >= amp/10, Zxx, 0)
>>> _, xrec = signal.istft(Zxx, fs)
# Compare the cleaned signal with the original and true carrier signals.
>>> plt.figure()
>>> plt.plot(time, x, time, xrec, time, carrier)
>>> plt.xlim([2, 2.1])
>>> plt.xlabel('Time [sec]')
>>> plt.ylabel('Signal')
>>> plt.legend(['Carrier + Noise', 'Filtered via STFT', 'True Carrier'])
>>> plt.show()

# Note that the cleaned signal does not start as abruptly as the original,
# since some of the coefficients of the transient were also removed:
>>> plt.figure()
>>> plt.plot(time, x, time, xrec, time, carrier)
>>> plt.xlim([0, 0.1])
>>> plt.xlabel('Time [sec]')
>>> plt.ylabel('Signal')
>>> plt.legend(['Carrier + Noise', 'Filtered via STFT', 'True Carrier'])
>>> plt.show()

check_COLA

Check whether the Constant OverLap Add (COLA) constraint is met

check_NOLA

Check whether the Nonzero Overlap Add (NOLA) constraint is met

lombscargle

Computes the Lomb-Scargle periodogram.

The Lomb-Scargle periodogram was developed by Lomb 2 and further extended by Scargle 3 to find, and test the significance of weak periodic signals with uneven temporal sampling.

When normalize is False (default) the computed periodogram is unnormalized, it takes the value (A**2) * N/4 for a harmonic signal with amplitude A for sufficiently large N.

When normalize is True the computed periodogram is normalized by the residuals of the data around a constant reference model (at zero).

Input arrays should be 1-D and will be cast to float64.

>>> import matplotlib.pyplot as plt
# First define some input parameters for the signal:
>>> A = 2.
>>> w = 1.
>>> phi = 0.5 * np.pi
>>> nin = 1000
>>> nout = 100000
>>> frac_points = 0.9 # Fraction of points to select
# Randomly select a fraction of an array with timesteps:
>>> r = np.random.rand(nin)
>>> x = np.linspace(0.01, 10*np.pi, nin)
>>> x = x[r >= frac_points]
# Plot a sine wave for the selected times:
>>> y = A * np.sin(w*x+phi)
# Define the array of frequencies for which to compute the periodogram:
>>> f = np.linspace(0.01, 10, nout)
# Calculate Lomb-Scargle periodogram:
>>> import scipy.signal as signal
>>> pgram = signal.lombscargle(x, y, f, normalize=True)
# Now make a plot of the input data:
>>> plt.subplot(2, 1, 1)
>>> plt.plot(x, y, 'b+')
# Then plot the normalized periodogram:
>>> plt.subplot(2, 1, 2)
>>> plt.plot(f, pgram)
>>> plt.show()


Appendix

_triage_segments

经过简化的后,可以通过如下的代码来了解频谱分析中是如何获取合适的窗函数的。在本文中,此函数会被引用自:

from scipy.signal.windows import get_window
def _triage_segments(window, nperseg, input_length):
 """
 Parses window and nperseg arguments for spectrogram and _spectral_helper.
 This is a helper function, not meant to be called externally.
 """
 # parse window; if array like, then set nperseg = win.shape
 if isinstance(window, str) or isinstance(window, tuple):
 # if nperseg not specified
 if nperseg is None:
 nperseg = 256 # then change to default
 if nperseg > input_length:
 warnings.warn('nperseg = {0:d} is greater than input length '
 ' = {1:d}, using nperseg = {1:d}'
 .format(nperseg, input_length))
 nperseg = input_length
 win = get_window(window, nperseg)
 else:
 win = np.asarray(window)
 if len(win.shape) != 1:
 raise ValueError('window must be 1-D')
 if input_length < win.shape[-1]:
 raise ValueError('window is longer than input signal')
 if nperseg is None:
 nperseg = win.shape[0]
 elif nperseg is not None:
 if nperseg != win.shape[0]:
 raise ValueError("value specified for nperseg is different"
 " from length of window")
 return win, nperseg

const_ext, even_ext, odd_ext, zero_ext

这四个边界延拓的辅助函数的源码,我贴在最后了。最重要的是,通过一下几个例子理解清楚,它们分别是什么意思即可。

  • const_ext: Constant extension at the boundaries of an array

    • Generate a new ndarray that is a constant extension of x along an axis. The extension repeats the values at the first and last element of the axis.

      >>> from scipy.signal._arraytools import const_ext
      >>> a = np.array([[1, 2, 3, 4, 5], [0, 1, 4, 9, 16]])
      >>> const_ext(a, 2)
      array([[ 1, 1, 1, 2, 3, 4, 5, 5, 5],
       [ 0, 0, 0, 1, 4, 9, 16, 16, 16]])
      # Constant extension continues with the same values as the endpoints of the
      # array:
      >>> t = np.linspace(0, 1.5, 100)
      >>> a = 0.9 * np.sin(2 * np.pi * t**2)
      >>> b = const_ext(a, 40)
      >>> import matplotlib.pyplot as plt
      >>> plt.plot(np.arange(-40, 140), b, 'b', lw=1, label='constant extension')
      >>> plt.plot(np.arange(100), a, 'r', lw=2, label='original')
      >>> plt.legend(loc='best')
      

  • even_ext: Even extension at the boundaries of an array

    • Generate a new ndarray by making an even extension of x along an axis.

      >>> from scipy.signal._arraytools import even_ext
      >>> a = np.array([[1, 2, 3, 4, 5], [0, 1, 4, 9, 16]])
      >>> even_ext(a, 2)
      array([[ 3, 2, 1, 2, 3, 4, 5, 4, 3],
       [ 4, 1, 0, 1, 4, 9, 16, 9, 4]])
      # Even extension is a "mirror image" at the boundaries of the original array:
      >>> t = np.linspace(0, 1.5, 100)
      >>> a = 0.9 * np.sin(2 * np.pi * t**2)
      >>> b = even_ext(a, 40)
      >>> import matplotlib.pyplot as plt
      >>> plt.plot(np.arange(-40, 140), b, 'b', lw=1, label='even extension')
      >>> plt.plot(np.arange(100), a, 'r', lw=2, label='original')
      >>> plt.legend(loc='best')
      >>> plt.show()
      

  • odd_ext: Odd extension at the boundaries of an array

    • Generate a new ndarray by making an odd extension of x along an axis.

      >>> from scipy.signal._arraytools import odd_ext
      >>> a = np.array([[1, 2, 3, 4, 5], [0, 1, 4, 9, 16]])
      >>> odd_ext(a, 2)
      array([[-1, 0, 1, 2, 3, 4, 5, 6, 7],
       [-4, -1, 0, 1, 4, 9, 16, 23, 28]])
      # Odd extension is a "180 degree rotation" at the endpoints of the original array:
      >>> t = np.linspace(0, 1.5, 100)
      >>> a = 0.9 * np.sin(2 * np.pi * t**2)
      >>> b = odd_ext(a, 40)
      >>> import matplotlib.pyplot as plt
      >>> plt.plot(np.arange(-40, 140), b, 'b', lw=1, label='odd extension')
      >>> plt.plot(np.arange(100), a, 'r', lw=2, label='original')
      >>> plt.legend(loc='best')
      >>> plt.show()
      

  • zero_ext: Zero padding at the boundaries of an array

    • Generate a new ndarray that is a zero-padded extension of x along an axis.

      >>> from scipy.signal._arraytools import zero_ext
      >>> a = np.array([[1, 2, 3, 4, 5], [0, 1, 4, 9, 16]])
      >>> zero_ext(a, 2)
      array([[ 0, 0, 1, 2, 3, 4, 5, 0, 0],
       [ 0, 0, 0, 1, 4, 9, 16, 0, 0]])
      
  • 简化后的操作源码:

    • From https://github.com/scipy/scipy/blob/master/scipy/signal/_arraytools.py

      from scipy.signal._arraytools import axis_slice
      def const_ext(x, n, axis=-1):
       """
       Constant extension at the boundaries of an array
       """
       if n < 1:
       return x
       left_end = axis_slice(x, start=0, stop=1, axis=axis)
       ones_shape = [1] * x.ndim
       ones_shape[axis] = n
       ones = np.ones(ones_shape, dtype=x.dtype)
       left_ext = ones * left_end
       right_end = axis_slice(x, start=-1, axis=axis)
       right_ext = ones * right_end
       ext = np.concatenate((left_ext,
       x,
       right_ext),
       axis=axis)
       return ext
      ##########################################################################
      def even_ext(x, n, axis=-1):
       """
       Even extension at the boundaries of an array
       """
       if n < 1:
       return x
       if n > x.shape[axis] - 1:
       raise ValueError(("The extension length n (%d) is too big. " +
       "It must not exceed x.shape[axis]-1, which is %d.")
       % (n, x.shape[axis] - 1))
       left_ext = axis_slice(x, start=n, stop=0, step=-1, axis=axis)
       right_ext = axis_slice(x, start=-2, stop=-(n + 2), step=-1, axis=axis)
       ext = np.concatenate((left_ext,
       x,
       right_ext),
       axis=axis)
       return ext
      ##########################################################################
      def odd_ext(x, n, axis=-1):
       """
       Odd extension at the boundaries of an array
       """
       if n < 1:
       return x
       if n > x.shape[axis] - 1:
       raise ValueError(("The extension length n (%d) is too big. " +
       "It must not exceed x.shape[axis]-1, which is %d.")
       % (n, x.shape[axis] - 1))
       left_end = axis_slice(x, start=0, stop=1, axis=axis)
       left_ext = axis_slice(x, start=n, stop=0, step=-1, axis=axis)
       right_end = axis_slice(x, start=-1, axis=axis)
       right_ext = axis_slice(x, start=-2, stop=-(n + 2), step=-1, axis=axis)
       ext = np.concatenate((2 * left_end - left_ext,
       x,
       2 * right_end - right_ext),
       axis=axis)
       return ext
      ##########################################################################
      def zero_ext(x, n, axis=-1):
       """
       Zero padding at the boundaries of an array
       """
       if n < 1:
       return x
       zeros_shape = list(x.shape)
       zeros_shape[axis] = n
       zeros = np.zeros(zeros_shape, dtype=x.dtype)
       ext = np.concatenate((zeros, x, zeros), axis=axis)
       return ext
      

  1. P. Welch, “The use of the fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms”, IEEE Trans. Audio Electroacoust. vol. 15, pp. 70-73, 1967. ↩︎

  2. N.R. Lomb “Least-squares frequency analysis of unequally spaced data”, Astrophysics and Space Science, vol 39, pp. 447-462, 1976 ↩︎

  3. J.D. Scargle “Studies in astronomical time series analysis. II - Statistical aspects of spectral analysis of unevenly spaced data”, The Astrophysical Journal, vol 263, pp. 835-853, 1982 例子: ↩︎

Python 中负数取余问题

2021年1月8日 08:00

最近发现在 Scipy 信号处理的原代码中,可以利用对负数取余的便利操作,进一步优化和清晰我们数据处理的过程。

“The % symbol in Python is called the Modulo Operator. It returns the remainder of dividing the left hand operand by right-hand operand. It’s used to get the remainder of a division problem.” — freeCodeCamp

官方 Python 文档中对 % 或者完全等价的 operator.mod() (需要import operator) 介绍不清楚,可以参考下面三个材料,写的非常好!

上面,是我搜集了 3 个与 % 相关的讲的非常详细细致的教程帖子。下面,我直接简明扼要的介绍这个 “负数取余” 的 trick。


例子:

12 % 5, -12 % 5
# output
# (2, 3)

这是为什么呢?

在数学里,“负数取余"遵循的是:

如果 ad 是整数,d 非零,那么余数 r 满足 a = q * d + r, q 为整数,且 0 <= |r| < |d|

由此可见,我们的被除数 a = 12, 我们的商 d = 5,那么有两个余 r 满足条件,分别是一个负的余数 r1 = -2 和正的余数 r2 = 3,并且总有规律 r1 + r2 = d

在计算机语言中,同号的整数运算,所有语言都遵循尽量让商小的原则,所以 12 mod 5-12 mod -5 是一样的方式,结果差一个符号,分别是 2-2。但是在异号的整数运算中,CJava 都是尽可能让商 d 更大 1(例如 -12 mod 5 的结果对应的是商 d = -2,余 r = -2),而 Python 则是会让商尽可能的小(例如 -12 mod 5 的结果对应的是商 d = -3,余 r = 3)。


最近,在我阅读 scipy/signal/spectral.py源代码时,看到在数据处理中使用“负数取余”可以写出更加简洁和清晰的代码。由此,进而可以给出如下的一种更好的理解方式:

还是 -12 mod 5 这个例子:

所以,有时候我们为了对时序数据计算 padding 等问题的时候,可以考虑令其数据长度为负,再整除以 step 后,来计算“余出”部分的数据长度,而不是“欠余”部分的数据长度。

恒 Q 变换 (Constant-Q transform)

2020年12月21日 08:00
  • 此文的内容主要基于一篇博文和一篇 paper 的第一、二章内容进一步加工和汇总总结的。
  • 同时,我还将 paper 中的 CQT MATLAB 版本代码都翻译到了 Python 版本。

由于在音乐中,所有的音都是由若干八度的 12 平均律共同组成的,这十二平均律对应着钢琴中一个八度上的十二个半音。这些半音临近之间频率比为 $2^{1/12}$。显然,同一音级的两个八度音,高八度音是低八度音频率的两倍。比方说,西方音乐的基频 (F0s) 可以如下定义:

$$ F_k = 440\text{Hz} \times 2^{k/12} $$ 其中,$k\in[-50,40]$ 是一个整数。

因此在音乐当中,声音都是以指数分布的,但我们的傅立叶变换得到的音频谱都是线性分布的,两者的频率点是不能一一对应的,这会指使某些音阶频率的估计值产生误差。所以现代对音乐声音的分析,一般都采用一种具有相同指数分布规律的时频变换算法—— CQT (Constant-Q transform)。

音频处理中的恒定Q变换,主要的应用在声源分离、音频谱分析及音频效果上。(Constant-Q transform in music processing. It is useful for several applications, including sound source separation, music signal analysis, and audio effects.)


什么是恒 Q 变换(CQT)

CQT 指中心频率按指数规律分布,滤波带宽不同、但中心频率与带宽比为常量 Q 的滤波器组。它与傅立叶变换不同的是,它频谱的横轴频率不是线性的,而是基于 $\log2$ 为底的,并且可以根据谱线频率的不同该改变滤波窗长度,以获得更好的性能。由于 CQT 与音阶频率的分布相同,所以通过计算音乐信号的 CQT 谱,可以直接得到音乐信号在各音符频率处的振幅值,对于音乐的信号处理来说简直完美。

CQT 变换没有得以广泛的得到应用有 4 个主要原因:

  1. 与 DFT 相比,它的计算效率是不行的;
  2. 缺少能够实现完美信号重构的逆 CQT 变换;
  3. CQT 的时频图像所表示的矩阵数据结构比短时傅里叶变换相比处理起来有难度;
  4. CQT 中,某确定的时间分辨率变化下会要计算不同的频域变换范围,换句话说要实现不同频率范围下的不同步采样。

关于信号处理中的窗口

在实际的信号处理过程中,我们会将时间片段分帧,按照帧为单位,转换成一个基于时间帧的频谱图,然后我们再将这些频谱图放到时间轴上,就可以形成一个类似热力图样的,基于时间变换的频谱变换图。

基于时间变换的频谱变换图
基于时间变换的频谱变换图

现在我们注意一个问题,在我们的单个时间帧内(实际上我们在对信号进行截断),信号很可能不呈现它原周期的整数倍(周期截断),那么截断后的信号会泄漏问题,为了更好地满足傅立叶变换(实际上是FFT)处理的周期性要求,我们需要使用加权函数,也叫窗函数 (windown function)。加窗主要是为了使时域信号似乎更好地满足FFT处理的周期性要求,减少泄漏。

我们关注上述“中心频率与带宽比为常量 Q”,从公式上看,我们可以表达为下述公式:

$$ Q = \frac{f_k}{\Delta f_k} . $$ 其中,filter width, $\Delta f_k$; $Q$, the “quality factor”.

直观的理解是,恒Q变换避免了时频分辨率均匀的缺点,对于低频的波,它的带宽十分小,但有更高的频率分辨率来分解相近的音符;但是对于高频的波,它的带宽比较大,在高频有更高的时间分辨率来跟踪快速变化的泛音。

CQT 可以看作是一个小波变换。

CQT 变换 $X^\text{QT}(k, n)$ 关于离散时域信号 $x(n)$ 可以定义为:

$$ X^{\mathrm{CQ}}(k, n)=\sum_{j=n-\left\lfloor N_{k} / 2\right\rfloor}^{n+\left\lfloor N_{k} / 2\right\rfloor} x(j) a_{k}^{*}\left(j-n+N_{k} / 2\right) $$

其中,$k=1,2,\dots,K$ 表示的 CQT 变换的频域 bins,$\lfloor\cdot\rfloor$ 表示的是向下取整,$a^*_k(n)$ 表示的是对 $a_k(n)$ 的复共轭。可以看到,上面公式中的基函数 $a_k(n)$ 是一个复数域的波形,通常叫做时频原子 (atom),定义为:

$$ a_{k}(n)=\frac{1}{N_{k}} w\left(\frac{n}{N_{k}}\right) \exp \left[-i 2 \pi n \frac{f_{k}}{f_{\mathrm{s}}}\right] $$

其中,$f_k$ 就是第 $k$ 个 bin 的中心频率,$f_s$ 是采样率,而 $w(t)$ 是连续的窗函数 (比方说汉宁窗或者布莱克曼窗),窗函数在 $t\in[0,1]$ 以外恒为零。上面两个公式中的窗口跨度 $N_k\in\mathbb{R}$ 是与 $f_k$ 成反比的实数,即在所有的 bins $k$ 上为恒定的常量 Q。

中心频率 $f_k$ 满足关系:

$$ f_{k}=f_{1} 2^{\frac{k-1}{B}} $$

其中,$f_1$ 就是最低频率 bin 里的中心频率,而 $B$ 是每一个八度 (octave) 中所取的 bins 数目。显然,$B$ 这个参数是一个 CQT 中很重要的超参数,因为它决定了 CQT 在时频域上的分辨率。

每一个 bin $k$ 所对应的常量 Q 为:

$$ Q = Q_{k} \stackrel{\text { def. }}{=} \frac{f_{k}}{\Delta f_{k}}=\frac{N_{k} f_{k}}{\Delta \omega f_{s}} $$

其中,$\Delta f_{k}$ 是 atom $a_k(n)$ 的频域带宽相应,$\Delta \omega$ 是窗函数 $w(t)$ 的能谱带宽。

我们当然希望让 Q 可以越大越好,这样的话每个 bin 的带宽 $\Delta f_k$ 就可以越来越小。但是 Q 并不能任意的大,否则 bins 之间的部分能谱就会没法有效的分析。所以,给定 Q 值后,也就意味着给定的能够重构波形的最小频率带宽:

$$ Q=\frac{q}{\Delta \omega\left(2^{\frac{1}{B}}-1\right)} $$

其中,$0<q \lesssim 1$ 是一个缩放因子,通常来说我们取 $q=1$。$q$ 越小也就意味着提高了时间分辨率,而降低频率分辨率。需要注意的是,比方说 $q=0.5$ 和 $B=48$ 与 $q=1$ 和 $B=24$ 是等价的,而前者在每个八度中会有两倍的频域 bins 数目。也就是说,$q<1$ 意味着在频域进行了过采样 (oversampling),类似于通过零填充来计算 DFT 一样。比方说,$q=0.5$ 对应于 2 倍的过采样,这在每个八度范围中取了一个等效的 $B/2$ 个 bins 的频域分辨率,尽管每个八度里确实分了 $B$ 个 bins。

上面两个公式合在一起后,就可以给出 $N_k$:

$$ N_{k}=\frac{q f_{\mathrm{s}}}{f_{k}\left(2^{\frac{1}{B}}-1\right)} $$

这个公式很重要,可以看到公式中没有了 $\Delta \omega$。


Codes

对输入信号的每一个采样点 $n$ 都去计算其 CQT 变换系数 $X^\text{CQ}(k,n)$ 是难以实现的。Constant-Q transform toolbox for music processing 这个文章就是解决了这个问题。文章作者谈到说,他们不仅可以让 CQT 的运算效率很好的提升(通过稀疏矩阵),还可以给出精确度很可观的 CQT 逆变换(通过上下采样来调节频段)。

作者提供的 MATLAB 代码可以在下面 👇 的链接中找到(原文中的链接已经失效):

以下摘自 Repo:

A Python/MATLAB reference implementation of a computationally efficient method for computing the constant-Q transform (CQT) of a time-domain signal.

Note: I just translate the core original MATLAB codes (/MATLAB) to Python version (/CQT.py) with following functions:

  • Core:

    • cqt
    • icqt
    • genCQTkernel
    • getCQT
    • cell2sparse
    • sparse2cell
    • plotCQT
  • Extra bonus:

    • buffer
    • upsample
    • round_half_up
    • nextpow2
    • hann

See the authors’ homepage for more information and MATLAB packaged downloads:

Requirements

  • Python 3.6+
  • Numpy
  • Scipy
  • Matplotlib

Demo

Note: It might not be as efficient than the original MATLAB version, partly because the sparse property have yet to be fully utilised in this Python version.

from CQT import *
fname = './demo.dat'
data = np.loadtxt(fname)
t, hp, hc = data[:,0], data[:,1], data[:,2]

fs = 1/(t[1]-t[0])
print('fs =', fs)

bins_per_octave = 24
fmax = 400
fmin = 20

Xcqt = cqt(hp, fmin, fmax, bins_per_octave, fs,)
_ = plotCQT(Xcqt, fs, 0.6)

y = icqt(Xcqt)

References

Python 装饰器之 Property: Setter 和 Getter

2020年11月25日 08:00

Getters(also known as ‘accessors’) and setters (aka. ‘mutators’) are used in many object oriented programming languages to ensure the principle of data encapsulation. Data encapsulation - as you can learn in a introduction on Object Oriented Programming of the tutorial - is seen as the bundling of data with the methods that operate on them. These methods are of course the getter for retrieving the data and the setter for changing the data. According to this principle, the attributes of a class are made private to hide and protect them from the other codes.

Unfortunately, it is widespread belief that a proper Python class should encapsulate private attributes by using getters and setters. As soon as one of these programmers introduces a new attribute, he or she will make it a private variable and creates “automatically” a getter and a setter for this attributes. Such programmers may even use an editor or an IDE, which automatically creates getters and setters for all private attributes. These tools even warn the programmer if she or he uses a public attribute! Java programmers will wrinkle their brows, screw up their noses, or even scream with horror when they read the following: The Pythonic way to introduce attributes is to make them public.

Source: Properties vs. Getters and Setters

Using @property decorators to achieve getters and setters behaviour.


Demo

用一个简单例子来开局,体会一般:

class Person:
 def __init__(self, name):
 self.name1 = name
 self.name2 = '小白'

 # 利用property装饰器将获取name方法转换为获取对象的属性
 @property
 def name(self):
 return self.name1 + '!'

 # 利用property装饰器将设置name方法转换为获取对象的属性
 @name.setter # @属性名.setter
 def name3(self, n):
 self.name1 = '小绿' if n == '小灰' else '小宝'


p = Person('小黑')
print(p.name, p.name1, p.name2, p.name3)
p.name3 = '小灰'
print(p.name, p.name1, p.name2, p.name3)
p.name3 = '小2'
print(p.name, p.name1, p.name2, p.name3)
p.name = '123'

Output:

小黑! 小黑 小白 小黑!
小绿! 小绿 小白 小绿!
小宝! 小宝 小白 小宝!
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-110-90af6f048a4f> in <module>
----> 1 p.name = '123'

AttributeError: can't set attribute

上图中的例子,我们可以直观的感受到 @property 装饰器将调用方法改为了调用对象,即 p.name() 改为了 p.name。另外,@name.setter 装饰器不仅将调用方法改为了获取指定对象的属性,即 p.name3 对应于 p.name()p.name。此外,对其赋值时相当于调用了方法,即有 p.name3 = n 对应于 p.name3(n)

值得留意的是,上述例子背后其实是在操作私有属性 p.name,使用者是透过 setter 方法来管理输入的值,并对 p.name 等属性参数进行赋值影响,直接对私有属性 p.name 进行赋值是不被允许的。

了解了其背后的执行逻辑和规律以后,下面给几个标准写法和实例:


Case 1

我们可以用其来对属性的赋值做判断和异常检测。

# Python program showing the use of 
# @property from https://www.geeksforgeeks.org/getter-and-setter-in-python/

class Geeks:
 def __init__(self):
 self._age = 0

 # using property decorator 
 # a getter function 
 @property
 def age(self):
 print("getter method called")
 return self._age

 # a setter function 
 @age.setter
 def age(self, a):
 if(a < 18):
 raise ValueError("Sorry you age is below eligibility criteria")
 print("setter method called")
 self._age = a

Case 2

另一种写法就是可以将 settergetter 作为私有方法隐藏起来:

# https://www.datacamp.com/community/tutorials/property-getters-setters
class FinalClass:

 def __init__(self, var):
 ## calling the set_a() method to set the value 'a' by checking certain conditions
 self.__set_a(var)

 ## getter method to get the properties using an object
 def __get_a(self):
 return self.__a

 ## setter method to change the value 'a' using an object
 def __set_a(self, var):

 ## condition to check whether var is suitable or not
 if var > 0 and var % 2 == 0:
 self.__a = var
 else:
 self.__a = 2

 a = property(__get_a, __set_a)

Case 3

这个例子来自 stackoverflow 上的回答,可以参考其是如何避免 delete 受保护的属性。

# https://stackoverflow.com/a/36943813/8656360
class Protective(object):
 """protected property demo"""
 #
 def __init__(self, start_protected_value=0):
 self.protected_value = start_protected_value
 # 
 @property
 def protected_value(self):
 return self._protected_value
 #
 @protected_value.setter
 def protected_value(self, value):
 if value != int(value):
 raise TypeError("protected_value must be an integer")
 if 0 <= value <= 100:
 self._protected_value = int(value)
 else:
 raise ValueError("protected_value must be " +
 "between 0 and 100 inclusive")
 #
 @protected_value.deleter
 def protected_value(self):
 raise AttributeError("do not delete, protected_value can be set to 0")

Output:

>>> p1 = Protective(3)
>>> p1.protected_value
3
>>> p1 = Protective(5.0)
>>> p1.protected_value
5
>>> p2 = Protective(-5)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "<stdin>", line 3, in __init__
 File "<stdin>", line 15, in protected_value
ValueError: protectected_value must be between 0 and 100 inclusive
>>> p1.protected_value = 7.3
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "<stdin>", line 17, in protected_value
TypeError: protected_value must be an integer
>>> p1.protected_value = 101
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "<stdin>", line 15, in protected_value
ValueError: protectected_value must be between 0 and 100 inclusive
>>> del p1.protected_value
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "<stdin>", line 18, in protected_value
AttributeError: do not delete, protected_value can be set to 0

Case 4

最后一个例子非常有趣,发现可以利用 Property 下的 Setter 和 Getter 决定多种属性之间的动态依赖关系

FYI:在信号处理中,时域下的采样率 sampling_rate,时长 time_duration 和采样点总数 Nt 三个变量中是任意两个可以推导出第三个变量。

# https://github.com/stephengreen/lfi-gw/blob/11cd4f650af793db45ebc78892443cc2b0b60f40/lfigw/waveform_generator.py#L250
class DSP():
 def __init__(self,):
 self.sampling_rate = 1024
 self.time_duration = 8
 @property
 def f_max(self):
 """Set the maximum frequency to half the sampling rate."""
 return self.sampling_rate / 2.0

 @f_max.setter
 def f_max(self, f_max):
 self.sampling_rate = 2.0 * f_max

 @property
 def delta_t(self):
 return 1.0 / self.sampling_rate

 @delta_t.setter
 def delta_t(self, delta_t):
 self.sampling_rate = 1.0 / delta_t

 @property
 def delta_f(self):
 return 1.0 / self.time_duration

 @delta_f.setter
 def delta_f(self, delta_f):
 self.time_duration = 1.0 / delta_f

 @property
 def Nt(self):
 return int(self.time_duration * self.sampling_rate)

 @property
 def Nf(self):
 return int(self.f_max / self.delta_f) + 1
def OUTPUT():
print('-'*20+'''\nsampling_rate: {}
time_duration: {}
f_max: {}
delta_t: {}
delta_f: {}
Nt: {}
Nf: {}\n'''.format(t.sampling_rate, t.time_duration, t.f_max, t.delta_t, t.delta_f, t.Nt, t.Nf)+'-'*20)

Output:

>>> t = DSP()
>>> OUTPUT()
--------------------
sampling_rate: 1024
time_duration: 8
f_max: 512.0
delta_t: 0.0009765625
delta_f: 0.125
Nt: 8192
Nf: 4097
--------------------
>>>t.f_max = 256
>>>OUTPUT()
--------------------
sampling_rate: 512.0
time_duration: 8
f_max: 256.0
delta_t: 0.001953125
delta_f: 0.125
Nt: 4096
Nf: 2049
--------------------
>>>t.delta_t = 1/8192
>>>OUTPUT()
--------------------
sampling_rate: 8192.0
time_duration: 8
f_max: 4096.0
delta_t: 0.0001220703125
delta_f: 0.125
Nt: 65536
Nf: 32769
--------------------

Reference

此文探讨非常浅显,仅简明扼要,更多的讨论素材和教程细节,可参阅下方的参考文献:

Unit 2: Verbs(学术写作)

2020年11月24日 08:00

这是一门来自埃米编辑 (AiMi Editor)的《SCI 论文写作视频课程》的详细学习笔记。该公开课程的原英文名称是《Writing in the sciences》 on Coursera,共六个单元,由来自斯坦福大学 (Stanford University) 的 Dr. Kristin Sainani 老师主讲。

About the course:

This course teaches scientists to become more effective writers, using practical examples and exercises. Topics include: principles of good writing, tricks for writing faster and with less anxiety, the format of a scientific manuscript, peer review, grant writing, ethical issues in scientific publication, and writing for general audiences.


上一个单元讲了有效写作的三个关键原则,并且着重谈到了第一个原则是如何从你的草稿中删掉杂乱的部分,本单元将会谈后两个原则,都与动词有关。

教你如何在写作中使用主动语态,以及如何用动词写作,这意味着要使用强动词,避免把动词变成名称,保持句子的主题,主谓语动词在句子中首尾紧密相连。

2.1 Use the active voice

要用主动语态

What is the active voice?

究竟什么是主动语态呢?

主动语态的格式为主语+动词+宾语(subject+verb+object),就和平常说话是一样的,比方说:

She throws the ball.

Martha will drive the car.

The president made mistakes.

主语是动作的“代理人”,而动作的“接受者”就是宾语。被动语态会使结构倒转,如下:

The ball is thrown by her.

The car will be driven by Matha.

Mistakes were made by the President.

Recognizing a passive verb

如何识别被动语态呢?

  • Passive verb = a form of the verb “to be” + the past participle of the main verb
  • The main verb must be a transitive verb (that is, take an object).

这个部分很简单,不用多说~

  • “to be” verbs:
    Is could be
    Are shall be
    Was should be
    Were will be
    Be would be
    Been may be
    Am might be
    must be
    has been

Example: passive voice

My first visit to Boston will always be remembered by me.

  • My first visit to Boston: Recipient of the action
  • remembered: Verb
  • me: Agent of the action

Active:

I will always remember my first visit to Boston.

再换个例子:

She is loved.

  • She: The recipient of the love.

  • is: Form of “to be”

  • loved: Past participle of a transitive verb: to love (direct object).

  • Example: passive voice

    Cigarette ads were designed to appeal especially to children.

    Active:

    We designed the cigarette ads to appeal especially to children.

    • We: Responsible party!

Passive vs. active voice

要把被动语态转换为主动语态,不妨尝试问自己:

“Who does what to whom?”

谁对谁做了什么?

要搞清楚行为是谁做的,行为又作用在了谁上?

  • Use active voice

    Passive:

    By applying a high resolution, 90 degree bending magnet downstream of the laser electron interaction region, the spectrum of the electron beams could be observed.

    我们知道上面例子中下划线处是观察到了什么,但不知道是谁做的。我们需要加个代理人作主语,表示谁做的观察。比方说代理人就是这句话的作者。可以改成如下句子:

    Active:

    We could observe the spectrum of the electron beams by applying a high resolution, 90 degree bending magnet downstream of the laser electron interaction region.

    再来个例子:

    Passive:

    Increased promoter occupancy and transcriptional activation of $\mathrm{p} 21$ and other target genes were observed.

    Active: We observed increased promoter occupancy and transcriptional activation of $\mathrm{p} 21$ and other target genes.

    再再来个例子:

    Passive: The activation of Ca++ channels is induced by the depletion of endoplasmic reticulum Ca++ stores.

    钙通道的激活是有内质网钙储存耗尽引起的。

    Active: Depleting Ca++ from the endoplasmic reticulum activates Ca++ channels.

    激活消耗血浆网末端的钙激活钙通道。

    可以看到,上面的例子同时也对句子中的词进行了缩减,去掉的多余的单词。

    再再来个例子:

    Passive:

    Additionally, it was found that pre-treatment with antibiotics increased the number of super-shedders, while immunosuppression did not.

    Active:

    Pre-treating the mice with antibiotics increased the number of super-shedders while immunosuppresion did not.

    用抗生素处理小鼠增加了超脱壳的数量,而免疫抑制没有。

Advanteges of the active voice

使用主动语态的三个关键原因:

  1. Emphasizes author responsibility

    强调作者责任

  2. Improves readability

    提高可读性

  3. Reduces ambiguity

    减少歧义

下面我们分别举例子来说明:

  • Emphasizes author responsibility

    No attempt was made to contact nonresponders because they were deemed unimportant to the analysis. (passive)

    We did not attempt to contact nonresponders because we deemed them unimportant to the analysis. (active)

    我们没有试图联系非响应者,因为我们认为它们对分析不重要。

  • Improves readability

    A strong correlation was found between use of the passive voice and other sins of writing. (passive)

    We found a strong correlation between use of the passive voice and other sins of writing. (active)

    我们发现被动语态的使用与写作的其他罪恶之间有很强的相关性。

    Use of the passive voice strongly correlated with other sins of writing. (active)

    被动语态与写作中的其他罪过密切相关。

  • Reduces ambiguity

    General dysfunction of the immune system at the leukocyte level is suggested by both animal and human studies. (passive)

    免疫系统的一般功能障碍,在白细胞水清由动物和人类研究提出。

    这里不清楚到底谁有免疫功能障碍。不得不加个词:“糖尿病”:

    Both human and animal studies suggest that diabetics have general immune dysfunction at the leukocyte level. (active)

    人类和动物的研究都表明糖尿病患者在白细胞水平上有普遍的免疫功能障碍。

Is it ever OK to use the passive voice?

被动语态难道就该斩尽杀绝嘛?当然不!

  • Yes! The passive voice exists in the English language for a reason. Just use it sparingly and purposefully.

    要想用被动语态,一定是有一个好的理由的,不应该只是出于习惯而使用它。你需要有目的地谨慎地使用它。

    • For example, passive voice may be appropriate in the methods section where what was done is more important than who did it.

      使用被动语态可以在一篇论文的方法论部分中。在方法部分中,做了什么行为,该动词的接受者比谁做更重要。被动语态在这一部分用起来会很好,因为它强调了完成的部分。

      另外,方法部分通常比较冗长,不那么生动,看得很细的人并不多。

  • 总之,Sainani 老师强烈建议:仅方法部分可以用被动语态,但是引言、讨论和总结部分,一定要用主动语态。


2.2 Is it really OK to use “We” and “I”?

使用“人称代词”到底行么?

Yes, It’s OK!

当然可以!有如下三个原因:

  1. The active voice is livelier and easier to read.

    为了使用主动语态,你通常得用到我或我们,这样主动语态更加鲜活,而且更容易阅读。是一种更清晰更吸引人的写作方式。

  2. Avoiding personal pronouns does not make your science more objective.

    去掉人称代词并不会使你的写作体现出更多的客观性。

  3. By agreeing to be an author on the paper, you are taking responsibility for its content. Thus, you should also claim respónsibility for the assertions in the text by using “we” or “I.”

    当你把自己的名字写在文章上的时候,你要对它的内容承担公众责任。所以,你应该铜鼓哦使用“我们”或“我”来主动要求为自己的断言(claim)承担责任。

Avoiding personal pronouns does not led objectivity

一些迷思:

You/your team designed, conducted, and interpreted the experiments. To imply otherwise is misleading.

你和你的团队设计、指导和解释了这些实验,以一种实验已经发生的方式进行写作是一种误导。

The experiments and analysis did not materialize out of thin air!

实验和分析并不会凭空实现的。

The goal is to be more objective, not to appear more objective.

目标是更加客观的呈现事实,而不是使科学更加主观。

Sainani 老师开始持续输出价值观:

“After all, human agents are responsible for designing experiments, and they are present in the laboratory; writing awkward phrases to avoid admitting their responsibility and their presence is an odd way of being objective.”

Jane J. Robinson, Science 7 June 1957: 1160 .

好吧,即使你不相信上述的原因,这里有一个非常实际原因:期刊编辑要求你这样做!

Journals want this!

The style guidelines for many journals explicitly instruct authors to write in the active voice.

期刊编辑认识到主动语态更容易阅读,像是读日记的体验。所以许多期刊的风格指南都明确地告诉你要用主动语态写作。

For example, Science magazine advises:

“Use active voice when suitable, particularly when necessary for correct syntax (e.g., “To address this possibility, we constructed a $\lambda$Zap library …).”

在适当的时候, 特别是必要的时候,请在正确的语法下使用主动语态。

http://www.sciencemag.org/site/feature/contribinfo/prep/res/style.xhtml

Great authors use “we” and “I”!

Watson and Crick’s celebrated 1953 paper in Nature begins:

We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.).”

http://www.exploratorium.edu/origins/coldsprinq/printit.htm

上面这篇论文本身在写作方面也是相当引人注目,有空的话要读读,注意一下他们的写作风格和技巧。


2.3 Active voice practice

Passive:

A recommendation was made by the DSMB committee that the study be halted.

Active

The DSMB committee recommended that the study be halted.

Passive:

Major differences in the reaction times of the two study subjects were found.

Active:

We observed major differences in the reaction times of the two study subjects.

The two study subjects differed in reaction times.

Passive:

It was concluded by the editors that the data had been falsified by the authors.

Active:

The editors concluded that the authors falsified their data.

Passive:

The first visible-light snapshot of a planet circling another star has been taken by NASA’s Hubble Space Telescope.

Active:

NASA’s Hubble Space Telescope has taken the first visible-light snapshot of a planet circling another star.

Passive:

Therefore, the hypothesis that the overall kinetics of a double transtibial amputee athlete and ana able-bodied sprinter at the same level of performance are not differnet was rejected.

Therefore, we rejected the hypothesis that the overall kinetics of a double transtibial amputee athlete and an able-bodied sprinter at the same level of performance are comparable.


2.4 Write with verbs

关于用动词来写作,有如下三个主要原则:

  • use strong verbs

    要使用强动词

  • avoid turning verbs into nouns

    避免把动词变成名词

  • don’t bury the main verb

    避免隐藏主动词

下面,我们来分别谈谈看:

Use strong verbs

Verbs make sentences go!

动词使句子通顺!

来举个例子:

“Loud music came from speakers embedded in the walls, and the entire arena moved as the hungry crowd got to its feet.”

响亮的音乐来自嵌在墙上的扬声器,当饥饿的人群站起来时,整个竞技场都在移动。

多好的句子,它很有描述性,不断前进的吸引着读者。

再看看下面这个原版的句子:

“Loud music exploded from speakers embedded in the walls, and the entire arena shook as the hungry crowd leaped to its feet.”

Bringing Down the House, Ben Mezrich

你可以看到那些富有表现力的活拨的伟大动词是如何使句子如此生动的。它把读者吸引进来,它使句子栩栩如生。当然,学术写作未必会有机会用到这种呈现夸张的词,但是我们也可以用动词表现得比用无聊的动词更好。

  • Pick the right verb!

    要用对词!

    举个栗子:

    The WHO reports that approximately two-thirds of the world’s diabetics are found in developing countries, and estimates that the number of diabetics in these countries will double in the next 25 year.

    世卫组织报告说大约三分之二世界上的糖尿病患者在有发展中国家被发现,并且估计这些国家的糖尿病患者人数将在未来 25 年翻一番。

    上面句子还挺不错的,两个动词用的中规中矩,其中report 这个词后面还有个副词 approximately,这也是应该避免的。下面可以改得更好:

    The WHO estimates that two-thirds of the world’s diabetics are found in developing countries, and projects that the number of diabetics in these countries will double in the next 25 years.

    project 这个词既可以避免上一个动词的重复,也可以很好的描绘对未来的估计。

    Sainani 老师建议可以使用同义词库来帮助你找到可以替换的动词。

  • Use “to be” verbs purposefully and sparingly.

    Is, are, was, were, be, been ,am …

    上面的助动词在科学写作中被过度使用,而且很无聊。有时你必须用助动词时,你不用回避它们,但它们也不应该是你论文中的主要动词。

Don’t turn verbs into nouns

  • Don’t kill verbs by turning them into nouns.

    把动词变成名词的问题,这是一个在学术写作中根深蒂固的坏习惯。

举个栗子:

During DNA damage, recognition of H3K4me3 by ING2 results in recruitment of Sin3/HDAC and repression of cell proliferation genes.

上面句子中的下划线名词都会拖累读者的阅读速度和阅读体验,让读者很难跟踪发生的事情。但这句不好改,因为没有足够的信息来正确地编辑。补充信息后,可以如下修改:

During DNA damage, H3K4me3 recruits ING2 and sin3/HDAC, which together repress cell proliferation .

由此可见,再一次说明了要确切的说清楚:谁对谁做了什么

Say exactly who does what to whom!

在学术写作中,我们经常做的另一件事是:我们想到了一个很好的时髦动词,却把它变成了一个无聊的名词,并配上一个无聊的动词,比如下面的例子:

Obtain estimates of $\rightarrow$ estimate

Has seen an expansion in $\rightarrow$ has expanded

Provides a methodologic emphasis $\rightarrow$ emphasizes methodology

Take an assessment of $\rightarrow$ assess

Privede a review of $\rightarrow$ review

Offer confirmation of $\rightarrow$ confirm

Make a decision $\rightarrow$ decide

Shows a peak $\rightarrow$ peaks

Provides a description of $\rightarrow$ describe

Don’t bury the main verb

不要把主谓语动词掩盖住

Keep the subject and main verb (predicate) close together at the start of the sentence…

要确保主语的主动词在句首附近。

  • Readers are waiting for the verb!

    读者在等待你的动词啊!

The case of the buried predicate…. 来个糟糕的例子体会下:

One study of 930 adults with multiple sclerosis (MS) receiving care in one of two managed care settings or in a fee-for-service setting found that only two-thirds of those needing to contact a neurologist for an MSrelated problem in the prior 6 months had done so (Vickrey et al 1999).

可以如下解决上面这个句子:

One study found that, of 930 adults with multiple sclerosis (MS) who were receiving care in one of two managed care settings or in a fee-for-service setting, only two-thirds of those needing to contact a neurologist for an MS-related problem in the prior six months had done so (Vickrey et al 1999).


2.5 Practice examples

来看例句:

The fear expressed by some teachers that students would not learn statistics well if they were permitted to use canned computer programs has not been realized in our experience. A careful monitoring of achievement levels before and after the introduction of computers in the teaching of our course revealed no appreciable change in students’ performances.”

  • 第一个句子中,主语是 fear,主语动词是 as not been realized,还是被动动词。第二个句子中,主语是 monitoring,可以考虑动词化它为 to monitor,主语动词是无聊的 revealed
  • 第一个句子中还有两个 not 构成的 negative 结构,听上去很尴尬,应该果断删掉,转换为 positive 结构。
  • 第二个句子中 appreciable 是一个限定词(hedge word),非常模棱两可。

Many teachers feared that the use of canned computer programs would prevent students from learning statistics. We monitored student achievement levels before and after the introduction of computers in our course and found no detriments in performance.

再来个例子:

Review of each center’s progress in recruitment is important to ensure that the cost involved in maintaining each center’s participation is worthwhile.”

  • 主语是 Review,可以动词化。有两个 is 这种 to be 结构应该转换为主动结构。
  • 存在一些非常空的和模糊的描述词汇,如 important, worthwhile。要具体点。
  • 还有一些非常笨拙的短语,如 involved in maintaining,很尴尬。

We should review each center’s recruitment progress to make sure its continued participation is cost-effective.

再再来个例子:

“It should be emphasized that these proportions generally are not the result of significant increases in moderate and severe injuries, but in many instances reflect mildly injured persons not being seen at a hospital.”

  • 首先是所谓“清嗓子” (dead weight) 的短语:It should be emphasized that,删掉。写在文章中的都是强调的。
  • these proportions 这个需要换成更具象的形容词会好一些。shift proportion 这次更好。
  • generally 副词要删掉。
  • 冗长的词:the result ofin many instances 换成 due tooften
  • 有两个 not,要改用 positive 结构。
  • being seen 这个尴尬的 to be 结构也要改。

Shifting proportions in injury severity may reflect stricter hospital admission criteria rather than true increases in moderate and severe injuries.

再再再来个例子:

Important studies to examine the descriptive epidemiology of autism, including the prevalence and changes in the characteristics of the population over time, have begun.

  • 首先,主语和谓语动词之间的距离太远了。主语是 studies 而句末 begun 才出现。
  • 要主要含糊不清的字眼。比如说 important
  • 多余的描述:over time,因为不能进行不随时间变化的更改。
  • 最后,of the population 这种说法是相当模糊的,直接删掉。

Studies have begun to describe the epidemiology of autism, including recent changes in the disorder’s prevalence and characteristics.

再再再再来个例子:

There are multiple other mechanisms that are important, but most of them are suspected to only have a small impact or are only important because of impact on one of the three primary mechanisms.

  • 首先 There are 应该删掉;important 也要换掉;are suspected to 这又是一个限定词,而且它还是被动语态。impact 名词可以换作动词。

Multiple other mechanisms play only a small role or work by impacting one of the three primary mechanisms.

再再再再再来个几个例子:

After rejecting paths with poor signal-tonoise ratios, we were left with 678 velocity measurements of waves with 7.5 seconds period and 891 measurements of 15 second waves.

Rejecting paths with poor signal-to-noise ratios left 678 velocity measurements of 7.5-second waves and 891 of 15-second waves.

It is suspected that the importance of temperature has more to do with impacting rates of other reactions than being a mechanism of disinfection itself since ponds are rarely hot enough for temperature alone to cause disinfection.

Ponds are rarely hot enough for temperature alone to cause disinfection; thus, the effect of temperature is likely mediated through its impact on the rates of other reactions.

It was assumed that due to reduced work at the joints of the lower limbs and less energy loss in the prosthetic leg, running with the dedicated prostheses allows for maximum sprinting at lower metabolic costs than in the healthy ankle joint complex.

(这是摘要的最后一句话)

The prosthetic leg reduces work and energy loss compared with a healthy ankle joint, which may lead to lower metabolic costs during maximum sprinting.


2.6 A few grammar tips

“Data are” not “Data is” …

The word “data” is plural.

Data 这个词要当做复数哦。猪油在讨论一个数据点时才使用单数形式。

ex:

These data show an unusual trend.

The data support the conclusion.

The data are critical.

(v. datum, singular form)

Affect vs. effect

  • Affect is the verb “to influence”

    The class affected her.

    • As a noun, affect denotes feeling or emotion shown by facial expression or body language, as in “The soldiers seen on television had been carefully chosen for blandness of affect” (Norman Mailer).
  • Effect is the noun form of this influence

    The class had an effect on her.

    • As a verb, effect means to bring about or to cause, as in “to effect a change”

一般来说,一个动词一个名词就可以区分好。

但也有一些非常少见的例外,比如说在心理学里 affect 是指一种感觉,一种情绪或一种表达。effect 的动词形式,非常特殊的情况下使用,比如 someone effected a change,表示某人带来了改变。

With

  • Compare to = to point out similarities between different things
  • Compare with ** (used more often in science) $=$ to point out differences between similar things

ex:

“Shall I compare thee to a summer’s day?” Brain tumors are relatively rare compared with more common cancers, such as those of the lung, breast, and prostate.

实际上 compared to 和 compared with 是不同的。前者是你想要指出不同事物之间的相似性,常有隐喻的含义。后者在科学文献中才是非常常用的。

That vs. which

  • “That” is the restrictive (defining) pronoun

  • “Which” is the nonrestrictive (non-defining) pronoun

That 一般用在当你有限制性的或者基本的从句时,which 是在非限制性或者非必要性从句时候适用。用逗号就可以简单区分。

What’s the difference between these two?

The vial that contained her RNA was lost.

The vial, which contained her RNA, was lost.

上面的第一句是在暗示有很多 vial,丢了的 vial 就是含有她的 RNA。第二句是只有一瓶大家都知道的 vial 丢了,含有着她的 RNA,删掉逗号之间部分内容不影响全句。

Example:

Other disorders which have been found to co-occur with diabetes include heart disease and foot problems.

上面的 which 要换掉!

  • Key question: Is your clause essential or nonessential?

    关键问题是要问:你的从句到底是必要的还是非必要的?

    • THAT: The essential clause cannot be eliminated without changing the meaning of the sentence.
    • WHICH: The non-essential clause can be eliminated without altering the basic meaning of the sentence (and must be set off by commas).

Example:

The bike that is broken is in the garage. (Identifies which bike of many.)

The bike, which is broken, is in the garage. (Adds a fact about the only bike in question).

“Careful writers, watchful for small conveniences, go which-hunting, remove the defining whiches, and by doing so improve their work.”

——Strunk and White

From physicist Richard Feynman:

“When we say we are a pile of atoms, we do not mean we are merely a pile of atoms because a pile of atoms which is not repeated from one to the other might well have the possibilities which you see before you in the mirror.”

应该把上面的 which 换成 that。

Stroke incidence data are obtained from sources, which use the ICD (International code of Diseases) classification systems.

上面的 which 应该去掉逗号,用 that

Sigular antecedents

Do not use “they” or “their” when the subject is singular. To avoid gender choice, turn to a plura!!

当句子的主语是单数时,不要用 they 或 their。也要避免性别选择。

Each student worries about their grade. (Wrong)

Each student worries about her grade. (Not good)

Better:

All students worry about their grades.


2.7 Demo Edit 2

(略)

S 变换 (Stockwel transform)

2020年11月21日 08:00
此文仅是处于研究需要的部分笔记和整理。

时频分析示例

REF: https://www.doc88.com/p-0126171299162.html

短时傅里叶变换

短时傅里叶变换是一种单一分辨率的信号分析方法,它的思想是选择一个时频局部化的窗函数,假定分析窗函数g(t)在一个短时间间隔内是平稳(伪平稳)的,移动窗函数,使f(t)g(t)在不同的有限时间宽度 内是平稳信号,从而计算出各个不同时刻的功率谱。短时傅里叶变换使用一个固定的窗函数,窗函数一旦确定了以后,其形状就不再发生改变, 短时傅里叶变换的分新率也就确定了。如果要改变分新率,则需要重新选择窗函数。短时傅里叶变换用来分析分段平稳信号或者近似平稳信号犹可,但是对于非平稳信号,当信号变化剧烈时,要求窗函数有较高的时间分辨率; 而波形变化比较平缓的时刻,主要是低频信号,则要求窗函数有较高的频率分辨率。短时傅里叶变换不能兼顾频率和时间分辨率的需求,由于其窗函数受到Heisenberg不确定准则的限制,时频窗的面积本小于2。这也就从另一个侧面说明了短时傅里叶变换窗函数的时间与频率分辨率不能同时达到最优。

$$F(\omega, t)=\int_{-\infty}^{\infty} e^{-i\omega t’} g\left(t^{\prime}-t\right) f\left(t^{\prime}\right) \mathrm{d} t^{\prime}$$ $$f\left(t^{\prime}\right)=\frac{1}{2 \pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(\omega, t) g\left(t-t^{\prime}\right) \mathrm{e}^{-i\omega t^{\prime}} \mathrm{d} \omega \mathrm{d} t$$

Gabor 变换

Gabor变换是Heisenberg不确定准则下的最优的短时傅里叶变换。高斯窗函数是短时傅里叶变换同时追求时间分辨率与频率分辨率时的最优窗函数。具有高斯窗函数的短时傅里叶变换就是Gabor 变换。与短时傅里叶变换一样,Gabor变换也是单一分辨率的。

$$g_{a}(x)=\frac{1}{2 \sqrt{\pi a}} \mathrm{e}^{-\frac{x^{2}}{4 a}}, a>0$$

Wigner-Ville 分布

Wigner-Ville分布,是一种最基本的时频分布,提供了信号清晰的时频关系。它有许多特有的性质,基于这些性质,有着多方面的应用。同时,WVD是时频分布方法的基础,其它的各种分布都是在它的概念上发展起来的。对于单分量线性调频信号,Wigner分布具有很好的时频聚集性,但是这种分布也有它的缺点,例如,正负性,当分析含有多个成分的信号时,分布存在着交叉项,影响了人们对时频分布的正确解释。

小波变换

小波变换使用一个窗函数(小波函数),时频窗面积不变, 但形状可改变。小波函数根据需要调整时间与频率分辨率,具有多分辨分析(Multi-resolution Analysis)的特点,克服了短时傅里叶变换分析非平稳信号单一分辨率的困难。小波变换是一种时间-尺度分析方法,而且在时间、尺度(频率)两域都具有表征信号局部特征的能力,在低频部分具有较高的频率分辨率和较低的时间分辨率,在高频部分具有较高的时间分辨率和较低的频率分辨率,很适合于探测正常信号中夹带的瞬间反常现象并展示其成分。所以,小波变换被称为分析信号的显微镜,但是小波分析不能完全取代傅里叶分析,小波分析是傅里叶分析的发展。

S 变换

S 变换是一种可逆的时频分析方法,它是短时窗傅立叶变换和小波变换的结合。它克服了短时窗傅立叶变换不能调节分析窗口频率的问题,同时引入了小波变换的多分辨率分析,且与傅立叶频谱保持直接的联系,针对地震资料的特点有很好的时频分析能力。Note: $\int_{-\infty}^{+\infty} S(\tau, f) d \tau=H(f)$ , $H(f)$ 为 $h(t)$ 的傅立叶变换。

$$ S(\tau, f)=\int_{-\infty}^{+\infty} h(t) \frac{|f|}{\sqrt{2 \pi}} e^{\frac{(\tau-t)^{2} f^{2}}{2}} e^{-i 2 \pi f t} d t $$ $$ h(t)=\int_{-\infty}^{+\infty}{\int_{-\infty}^{+\infty} S(\tau, f) d \tau} e^{i 2 \pi ft} d f $$

S 变换可以很容易的写成 h(t)的傅立叶变换 H(f)形式:

$$ S(\tau, f)=\int_{-\infty}^{+\infty} H(\alpha+f) e^{-\frac{2 \pi^{2} \alpha^{2}}{f^{2}}} e^{i 2 \pi \alpha \tau} d \alpha $$

S 变换是短时傅立叶变换和小波变换的结合,所以包含了它们的线性特征。对于含噪音的信号,记为: $\operatorname{data}(t)=\operatorname{signal} (t)+\text { noise }(t)$,S 变换给出了以下公式:

$$S T_{\text {data}}=S T_{\text {signal}}+S T_{\text {noise}}$$


Code


Book


概述

  • Stockwel l R G,Mansinha L P,Lowe R. Localization of the complex spectrum: the S transform. IEEE Transactions on Signal Procesing,1996,44(4):998-1001.

  • 郑成龙,王宝善. S 变换在地震资料处理中的应用及展望. 地球物理学进展 ,2015,30(4):1580-1591.

    Zheng Chenglong,Wang Baoshan.Application of S transform in seismic data procesing.Progres in Geo- physics,2015,30(4):1580-1591.

  • 刁瑞,单联瑜,尚新民等. 微地震监测数据时频域去噪方法 [J].物探与化探 ,2015,39(1):112—117.

国内外研究进展

S 逆变换的信号重构过程

  • Pinnegar (2003) 等提出使用时变滤波函数剔除随机噪声;

    Pinnegar C R and Eaton D W.Application of S transform to prestack noise atenuation filtering. Journal of Geophysical Research,2003,108(B9):2422-2443.

  • 高静怀 (2004) 等利用噪声与有效信号间的统计特性差别,在广义 S 变换域识别噪声与有效信号,用 4 个待 定参数的调幅简谐波来代替 S 变换中的基本小波,说明了薄层识别的可行性;

    高静怀,陈文超,李幼铭等. 广义 S 变换与薄互层地震响应分析. 地球物理学报,2003,46(4):526-532.

    Gao Jinghuai,Chen Wenchao,Li Youming et al. Generalized S transform and seismic response analysis of thin interbeds.Chinese Journal of Geophysics,2003,46(4):526-532.

    熊晓军 (2006) 等在其基础上从检测不同信号的到达时间、识别薄层和制作高分辨剖面三个方面研究了广义 S 变换在地震高分辨处理方面的应用;

    熊晓军,贺振华,黄德济等. 广义 S 变换在地震高分辨处理中的应用. 勘探地球物理进展,2006,29(6): 415-418.

    Xiong Xiaojun,He Zhenhua,Huang Deji et al. Appliation of generalized S transform in seismic high reso- lution procesing. Progres in Exploration Geophysics. 2006,29(6):415-418.

    金国平 (2009) 在其基础上结合改进的反褶积方法对地震资料进行了拓频处理;

    金国平. 广义 S 变换在地震高分辨率处理中的应用. 石油仪器,2009,23(3):51-53.

    Jin Guoping. Application of generalized S transformation to improve seismic data resolution. PI,2009, 23(3):51-53.

    刘喜武 (2006) 等提出了基于广义 S 变换的吸收衰减补偿方法;

    刘喜武,年静波,刘洪. 基于广义 S 变换的吸收衰减补偿方法. 石油物探,2006,45(1):9-14.

    Liu Xiwu,Nian Jingbo,Liu Hong. Generalized S-transform based compensation for stratigraphic absorption of seismic atenuation.GPP,2006,45(1): 9-14.

    孙雷鸣 (2011) 等在其基础上进行了改进,利用指数变换的修正规范方差模对补偿后的频谱进行了修正。

    孙雷鸣,万欢,陈辉等. 基于广义 S 变换地震高分辨率处理方法的改进及在流花11-1油田的应用. 中国海上油气,2011,23(4):234-237.

    Sun Leiming,Wan Huan,Chen Hui et al. An improved method of sesmic high-resolution procesing based on generalized S transform and its application in LH11-1oilfield. China Ofshore Oil and Gas,2011,23(4):234-237.

  • 陈学华 (2006) 等通过引入调节参数 $p$ 和 $\lambda$ 对 S 变换的窗函数进行改造,在信号的时频域中设计两种时频滤波器,滤除特定时频域中的噪声;

    陈学华,贺振华,黄德济.基于广义S变换的信号提取与抑噪. 成都理工大学学报 (自然科学版),2006, 33(4):331-335.

    Chen Xuehua,He Zhenhua,Huang Deji.Signal extraction and noise suppresion based on generalized S transform.Journal of Chengdu University of Technology (Natural Science Edition),2006,33(4):331-335.

    $$ w(t, f)=\frac{\lambda|f|^{p}}{\sqrt{2 \pi}} \mathrm{e}^{\frac{-\lambda^{2} f^{2p} t^2}{2}} $$

  • Schimmel (2005) 等指出基于反 S 变换时频滤波方法存在的问题,并提出一种新的反 S 变换方法,用于时频滤波;

    Schimmel M and Galart J.The inverse S transform in filters with time-frequency localization.IEEE Transactions on Signal Procesing,2005,53(11):4417-4422.

  • 赵淑红 (2007) 等证明了基于 S 变换的时频滤波去噪方法,可以克服传统方法中滤波因子不能随时间、频率而变化的缺陷;

    赵淑红,朱光明. S 变换时频滤波去噪方法. 石油地球物理勘探,2007,42(4):402-406.

    Zhao Shuhong,Zhu Guangming. S-transform time-frequency filter de-noising method.OGP,2007,42(4): 402-406.

    $$ H(n, k)=\begin{cases} 1 & k \in\left[f_{i}(n)-B(n) / 2, f_{i}(n)+B(n) / 2\right] \\ 0 & \text { 其他 } \end{cases}. $$

  • 张晓峰 (2010) 提出对信号进行广义 S 变换后,对获得的含噪时频剖面选取适当阈值函数压制噪声,从而提取有效信号重构去噪后的地震记录。

    张晓峰. 基于广义 S 变换的地震资料信噪分离方法. 物化探计算技术,2010,32(5):480-483.

    Zhang Xiaofeng. Signal segmentation of seismic data based on generalized S transform. Computing Techniques for Geophysical and Geochemical Exploration,2010,32(5):480-483.

基于 S 变换的低信噪比阈值滤波

  • 金智尹 (2015) 等在 S 变换的基础上,根据信号时频分布构造三种高斯邻域局部阈值滤波方法用于高精度滤波去噪;

    金智尹,柏强. 基于广义S变换的高斯领域时频滤波方法. 电子测量与仪器学报,2015,29(1):125-131.

    Jin Zhiyin,Bai Qiang. Time-frequency filtering in Gausian domain based on generalized S-transform. Journal of Electronics & Measurements,2015,29(1): 125-131.

  • 刘永春 (2011) 等比较了基于广义 S 变换的三种时频率滤波方法对声波反射信号的去噪效果;

    刘永春,童敏明,陈琳等. 基于广义S变换的声发射信号去噪研究. 计算机应用研究,2011,28(12):4535-4536.

    Liu Yongchun,Tong Minming,Chen Lin et al. Research on de-noising of acoustic emision signals based on generalized S transform. Application Research of Computers,2011,28(12):4535-4536.

  • 李雪英 (2011) 等对基于广义 S 变换和经验模态分解的高频噪声压制方法在去噪原理、去噪效果、计算效率、保真度等方面进行了对比分析。

    李雪英,孙丹,侯相辉等. 基于广义 S 变换、经验模态分解叠前去噪方法的比较. 地球物理学进展,2011, 26(6):2039-2045.

    Li Xueying,Sun Dan,Hou Xianghui et al. Comparison of pre-stack de-noising method based on generalized S-transform and empirical mode decomposition.Progres in Geophysics,2011,26(6):2039-2045.

  • 张岩 (2017) 等依据图像块分组稀疏表示思想,利用邻近地震数据块之间多个地震记录道的相似性,提出基于多道相似组稀疏表示的地震数据随机噪声压制方法;

    张岩,任伟建,唐国维. 利用多道相似组稀疏表示方法压制随机噪声. 石油地球物理勘探,2017 ,52 (3 ):442-450.

    Zhang Yan,Ren Weijian,Tang Guowei. Random noise suppresion based on sparse representation of multitrace similarity group.OGP,2017,52(3):442-450.

  • 张华 (2017) 等提出使用二维曲波变换对含噪三维地震数据的时间切片进行多尺度、多方向分解,在曲波域采用阈值法压制随机噪声;

    张华,陈小宏,李红星等. 曲波变换三维地震数据去噪技术. 石油地球物理勘探,2017,52(2):226-232.

    Zhang Hua,Chen Xiaohong,Li Hongxing et al. 3D seismic data de-noising approach based on Curvelet transform .OGP,2017,52(2 ):226-232.

  • 张恒磊 (2017) 等基于反射波各向异性、随机噪声各向同性特征,通过构造各向异性函数进行加权叠加的数据保真处理,实现地震数据保真去噪;

    张恒磊,胡哲,胡祥云等. 基于反射波各向异性特征的保真去噪方法. 石油地球物理勘探,2017 ,52 (2 ):233-241.

    Zhang Henglei,Hu Zhe,Hu Xianyun et al. Seismic fidelity denoising method with reflection anisotropy. OGP,2017,52(2):233-241.

  • 宋维琪 (2015) 等针对地面微地震检测资料中噪声特点研究地面微地震资料 τ-p 变换去噪方法;

    宋维琪,刘太伟. 地面微地震资料 τ-p 变换噪声压制. 石油地球物理勘探,2015,50(1):48-53.

    Song Weiqi,Liu Taiwei. Noise suppresion by τ-p transform of surface microseismic data.OGP,2015, 50(1):48-53

  • 赵军龙 (2016) 等比较分析了小波去噪算法与希尔伯特—黄变换滤波算法在不同信噪比情况下对常规测井曲线的滤波效果的差异;

    赵军龙,刘建建. 常规测井曲线的小波和希尔伯特—黄变换滤波效果分析. 石油地球物理勘探,2016,51(4):801-808.

    Zhao Junlong,Liu Jianjian. Analysis of wavelet and Hilbert-Huang transform filtering of conventional wel logging curve.OGP,2016,51(4):801-808 .

调整窗函数获得多种非线性变化特征

  • Mansinha (1997) 等用 (f/r) 代替 f,得到调谐的高斯函数,允许使用者自定 S 变换在时频面上时间和频率的分辨率;

    Mansinha L,Stockwell R G,Lowe R P,et al. Local S—spectrum analysis of 1一D and 2一D data [J]. Physics of the Earth and Planetary Interiors,1997,103(3):329—336.

  • Pinnegar (2003) 等提出用非对称的双曲窗代替高斯窗,用于地震波的 P 波首波时间的判定;

    Pinnegar C R,Mansinha L. Time-local spectral analysis for non-stationary time series: the S-transform for noisy signals [J]. Fluctuation and Noise Leters,2003,3(03):L357-L364.

  • Wang (2015) 等基于信号的广义S变换时频域数据,根据有效信号与噪声的能量差异提出一种新的数据自适应滤波算法,用于抑制有效信号在时频域中的随机噪声;

    Wang D,Wang J,Liu Y et al. An adaptive time-frequency filtering algorithm for multi-component LFM signals based on generalized S-transform.IEEE International Conference on Automation and Computing,2015,1-6.

  • Duan Li (2013) 等提出一种修正的 S 变换新方法,对高斯窗函数进行改进,利用一个线性频率方程代替高斯窗的频率,调节时窗宽度随频率呈反比变化的速度,提高了 S 变换在具体应用中的实用性和灵活性;

    Li D,Castagna J. Modified S-transform in time-frequency analysis of seismic data [C] // 2013 SEG Annual Meeting. Society of Exploration Geophysicists,2013.

  • 张先武 (2013) 等在 S 变换的时窗函数中引入调节参数的同时加入低通滤波函数,推导出一种新的广义 S 变换算法,并使用该算法对探地雷达数据进行层位识别,取得了较好效果;

    张先武,高云泽,方广有. 带有低通滤波的广义 S 变换在探地雷达层位识别中的应用. 地球物理学报,2013 , 56(1):309-316.

    Zhang Xianwu,Gao Yunze,Fang Guangyou. Application of generalized S-transform with low-pass filter in ground penetrating radar layer recognition. Chinese Journal of Geophysics,2013,56(1):309-316.

  • 黄捍东 (2014) 等在广义 S 变换 (陈学华 et al. 的窗口函数) 实现时引入一系列函数库和快速傅里叶变换,使运算简洁高效,并通过选取合适参数组合对时频谱进行能量重新分配重构,获得高分辨率的地震信号。

    黄捍东,冯娜. 广义 S 变换地震高分辨率处理方法研究. 石油地球物理勘探,2014,49(1):82-88.

    Hunag Handong,Feng Na. High-resolution seismic procesing method based on generalized S transform. OGP,2014,49(1):82-88.

    $$ w(t, f)=\frac{\lambda|f|^{p}}{\sqrt{2 \pi}} \mathrm{e}^{\frac{-\lambda^{2} f^{2p} t^2}{2}} $$

    $$ \text{GST}_{x}(\tau, f)=\Big(x(\tau) {e}^{-i 2 \pi f z}\Big) \star\Big(\frac{\lambda|f|^{p}}{\sqrt{2 \pi}} {e}^{\frac{-\lambda^{2} f^{2 p} x^{2}}{2}}\Big) $$ $$ \text{GST}_{x}(i, j)=[e_{j}(i) \cdot x(i \Delta t)] * w_{j}(i) $$

  • 阮清青 (2017) 等人利用频率相关的一阶线性方程来代替归一化高斯窗口函数中的频率。

    阮清青,张会星,王昊,李凯瑞. 修正 S 变换与常规时频分析方法的对比. 中国煤炭地质, 1674—1803(2017)04-0066-07

    Ruan Qingqing,Zhang Huixing,Wang Hao and Li Kairui. Comparison of Modified S-transform (MST) and Traditional Time-Frequency Analysis Methods. 1674—1803(2017)04-0066-07

    $$ w(t, f)=\frac{|mf+n|}{\sqrt{2 \pi}} {e}^{\frac{- |mf+n|^2 t^2}{2}} $$

  • 曹鹏涛 (2018) 等在 Pinnegar 方法基础上,采用非对称双曲线窗函数,改进时频滤波函数,联合高斯平滑滤波函数提出一种数据 自适应的高频噪声压制方法;

    曹鹏涛,张敏,李振春. 基于广义 S 变换及高斯平滑的自适应滤波去噪方法. 石油地球物理勘探,2018,53(6): 1128-1136,1187.

Interactive GW simulation in JavaScript for NSs or BBHs

2020年11月16日 08:00

👉 Click me to see simulation 👈

数据来源 (Data Sources):


  • bns_hypnotise_small:
  • bns_hypnotise_small_tm_circular:
  • bns_hypnotise_small_tm_linear:
  • from_bns_1:
  • iota_0:
  • iota_45:
  • iota_90:
  • small_anim:

Linux/Unix 中 Screen 命令详解

2020年11月10日 08:00

Screen 是一款由 GNU 计划开发的用于命令行终端切换的自由软件。用户可以通过该软件同时连接多个本地或远程的命令行会话,并在其间自由切换。GNU Screen 可以看作是窗口管理器的命令行界面版本。它提供了统一的管理多个会话的界面和相应的功能。

  • 会话恢复

只要 Screen 本身没有终止,在其内部运行的会话都可以恢复。这一点对于远程登录的用户特别有用——即使网络连接中断,用户也不会失去对已经打开的命令行会话的控制。只要再次登录到主机上执行 screen -r 就可以恢复会话的运行。同样在暂时离开的时候,也可以执行分离命令 detach,在保证里面的程序正常运行的情况下让 Screen 挂起(切换到后台)。这一点和图形界面下的 VNC 很相似。

  • 多窗口

在 Screen 环境下,所有的会话都独立的运行,并拥有各自的编号、输入、输出和窗口缓存。用户可以通过快捷键在不同的窗口下切换,并可以自由的重定向各个窗口的输入和输出。Screen实现了基本的文本操作,如复制粘贴等;还提供了类似滚动条的功能,可以查看窗口状况的历史记录。窗口还可以被分区和命名,还可以监视后台窗口的活动。 会话共享 Screen 可以让一个或多个用户从不同终端多次登录一个会话,并共享会话的所有特性(比如可以看到完全相同的输出)。它同时提供了窗口访问权限的机制,可以对窗口进行密码保护。

GNU’s Screen 官方站点::http://www.gnu.org/software/screen/

screen 安装

  1. 从官方站点上下载某版本(x.x.x)源码安装包,如:screen-x.x.x.tar.gz
  2. 解压:tar -zxvf screen-x.x.x.tar.gz
  3. 进入解压后的目录:cd screen-x.x.x
  4. 最后,编译即可:
    $>bash configure
    $>make
    $>make install
    

screen 命令语法

$> screen [-AmRvx -ls -wipe][-d <作业名称>][-h <行数>][-r <作业名称>][-s ][-S <作业名称>]

-A  将所有的视窗都调整为目前终端机的大小。

-d <作业名称>  将指定的screen作业离线。

-h <行数>  指定视窗的缓冲区行数。

-m  即使目前已在作业中的screen作业,仍强制建立新的screen作业。

-r <作业名称>  恢复离线的screen作业。

-R  先试图恢复离线的作业。若找不到离线的作业,即建立新的screen作业。

-s  指定建立新视窗时,所要执行的shell。

-S <作业名称>  指定screen作业的名称。

-v  显示版本信息。

-x  恢复之前离线的screen作业。(会话共享)

-ls--list  显示目前所有的screen作业。

-wipe  检查目前所有的screen作业,并删除已经无法使用的screen作业。

常用 screen 命令的参数 | 在主终端界面下

参数很多记不住?没关系,可以先记住下面👇这几个最常用的参数:

$>screen -S yourname #-> 新建一个叫 yourname 的 session
$>screen -ls #-> 列出当前所有的 session
$>screen -r yourname #-> 回到 yourname 这个session
$>screen -x yourname #-> 回到 yourname 这个已经 attached session
$>screen -d yourname #-> 远程 detach 某个session
$>screen -d -r yourname #-> 结束当前 session 并回到 yourname 这个 session

常用 screen 的快捷键 | 在 Session 终端下 Ctrl+a(C-a)

除了要熟悉在自己主终端界面下的 screen 指令外,还一定要会使用快捷键远程控制不同的 Session 终端窗口。在每个 screen session 下,所有命令都以 Ctrl+a(C-a) 开始。 下面👇列出了一系列 Emacs 风格的快捷按键: 例如第一个指令表示的是按住 Ctrl 的同时,依次键入 a?

C-a ? 显示所有键绑定信息

C-a c 创建一个新的运行 shell 的窗口并切换到该窗口

C-a n Next,切换到下一个 window

C-a p Previous,切换到前一个 window

C-a 0..9 切换到第 0..9 个 window

Ctrl+a [Space] 由视窗0循序切换到视窗9

C-a C-a 在两个最近使用的 window 间切换

C-a x 锁住当前的 window,需用用户密码解锁

C-a d detach,暂时离开当前session,将目前的 screen session (可能含有多个 windows) 丢到后台执行,并会回到还没进 screen 时的状态,此时在 screen session 里,每个 window 内运行的 process (无论是前台/后台)都在继续执行,即使 logout 也不影响。

C-a z 把当前session放到后台执行,用 shell 的 fg 命令则可回去。

C-a w 显示所有窗口列表

C-a t time,显示当前时间,和系统的 load

C-a k kill window,强行关闭当前的 window

C-a [ 进入 copy mode,在 copy mode 下可以回滚、搜索、复制就像用使用 vi 一样

  • C-b Backward,PageUp
  • C-f Forward,PageDown
  • H(大写) High,将光标移至左上角
  • L Low,将光标移至左下角
  • 0 移到行首
  • $ 行末
  • w forward one word,以字为单位往前移
  • b backward one word,以字为单位往后移
  • Space 第一次按为标记区起点,第二次按为终点
  • Esc 结束 copy mode

C-a ] paste,把刚刚在 copy mode 选定的内容贴上

常用操作教程

创建会话

当你安装完成后,直接敲命令 screen 就可以启动它,但是这样启动的 screen 会话没有名字,实践上推荐为每个 screen 会话取一个名字,方便分辨:

$>screen -S david

screen 启动后,会创建第一个窗口,也就是窗口 No. 0,并在其中打开一个系统默认的 shell,一般都会是 bash。所以你敲入命令 screen 之后,会立刻又返回到命令提示符,仿佛什么也没有发生似的,其实你已经进入 Screen 的世界了。当然,也可以在 screen 命令之后加入你喜欢的参数,使之直接打开你指定的程序,例如:

$>screen vi david.txt

这是 screen 创建一个执行 vi david.txt 的单窗口会话,退出 vi 将退出该窗口/会话。

查看窗口和窗口名称

打开多个窗口后,可以使用快捷键 C-a w 列出当前所有窗口。如果使用文本终端,这个列表会列在屏幕左下角,如果使用 X 环境下的终端模拟器,这个列表会列在标题栏里。窗口列表的样子一般是这样:

0$ bash 1-$ bash 2*$ bash

可以看到,这个例子中我们开启了三个窗口,其中 * 号表示当前位于窗口2,- 号表示上一次切换窗口时位于窗口1。

Screen 默认会为窗口命名为编号和窗口中运行程序名的组合,上面的例子中窗口都是默认名字。练习了上面查看窗口的方法,你可能就希望各个窗口可以有不同的名字以方便区分了。可以使用快捷键 C-a A 来为当前窗口重命名,按下快捷键后,Screen 会允许你为当前窗口输入新的名字,回车确认。

会话分离与恢复

你可以不中断 Screen 窗口中程序的运行而暂时断开(detach)Screen 会话,并在随后时间重新连接(attach)该会话,重新控制各窗口中运行的程序。例如,我们打开一个 screen 窗口编辑 /tmp/david.txt 文件:

%>screen vi /tmp/david.txt

之后我们想暂时退出做点别的事情,比如出去散散步,那么在 screen 窗口键入 C-a d,Screen 会给出 detached 提示:

[detached]

暂时中断会话。

半个小时之后回来了,找到该 screen 会话:

$>screen -ls

我们可以用 screen -r 12865 重新连接会话,一切都还在哈!

当然,如果你在另一台机器上没有分离一个 Screen 会话,就无从恢复会话了。这时可以使用下面命令强制将这个会话从它所在的终端分离,转移到新的终端上来:

清除 dead 会话

如果由于某种原因其中一个会话死掉了(例如人为杀掉该会话),这时 screen -list 会显示该会话为 dead 状态。使用 screen -wipe 命令清除该会话:

关闭或杀死窗口

正常情况下,当你退出一个窗口中最后一个程序(通常是 bash)后,这个窗口就关闭了。另一个关闭窗口的方法是使用 C-a k,这个快捷键杀死当前的窗口,同时也将杀死这个窗口中正在运行的进程。

如果一个 Screen 会话中最后一个窗口被关闭了,那么整个 Screen 会话也就退出了,screen 进程会被终止。

除了依次退出/杀死当前 Screen 会话中所有窗口这种方法之外,还可以使用快捷键 C-a :,然后输入 quit 命令退出 Screen 会话。 需要注意的是,这样退出会杀死所有窗口并退出其中运行的所有程序。其实 C-a : 这个快捷键允许用户直接输入的命令有很多,包括分屏可以输入 split 等,这也是实现Screen 功能的一个途径,不过个人认为还是快捷键比较方便些。

会话共享 (高级)

还有一种比较好玩的会话恢复,可以实现会话共享。假设你在和朋友在不同地点以相同用户登录一台机器,然后你创建一个 screen 会话,你朋友可以在他的终端上命令:

$>screen -x

这个命令会将你朋友的终端 Attach 到你的 Screen 会话上,并且你的终端不会被 Detach。这样你就可以和朋友共享同一个会话了,如果你们当前又处于同一个窗口,那就相当于坐在同一个显示器前面,你的操作会同步演示给你朋友,你朋友的操作也会同步演示给你。当然,如果你们切换到这个会话的不同窗口中去,那还是可以分别进行不同的操作的。

会话锁定与解锁 (高级)

Screen 允许使用快捷键 C-a s 锁定会话。锁定以后,再进行任何输入屏幕都不会再有反应了。但是要注意虽然屏幕上看不到反应,但你的输入都会被 Screen 中的进程接收到。快捷键 C-a q 可以解锁一个会话。

也可以使用 C-a x 锁定会话,不同的是这样锁定之后,会话会被 Screen 所属用户的密码保护,需要输入密码才能继续访问这个会话。

发送命令到screen会话 (高级)

在 Screen 会话之外,可以通过 screen 命令操作一个 Screen 会话,这也为使用 Screen 作为脚本程序增加了便利。关于 Screen 在脚本中的应用超出了入门的范围,这里只看一个例子,体会一下在会话之外对 Screen 的操作:

$>screen -S sandy -X screen ping www.baidu.com

这个命令在一个叫做 sandy 的 screen 会话中创建一个新窗口,并在其中运行 ping 命令。

屏幕分割 (高级)

现在显示器那么大,将一个屏幕分割成不同区域显示不同的 Screen 窗口显然是个很酷的事情。可以使用快捷键 C-a S 将显示器水平分割,Screen 4.00.03 版本以后,也支持垂直分屏,快捷键是 C-a |。分屏以后,可以使用 C-a Tab 在各个区块间切换,每一区块上都可以创建窗口并在其中运行进程。

可以用 C-a X 快捷键关闭当前焦点所在的屏幕区块,也可以用 C-a Q 关闭除当前区块之外其他的所有区块。关闭的区块中的窗口并不会关闭,还可以通过窗口切换找到它。

C/P 模式和操作 (高级)

Screen 的另一个很强大的功能就是可以在不同窗口之间进行复制粘贴了。使用快捷键 C-a Esc 或者 C-a [ 可以进入 copy/paste 模式,这个模式下可以像在 vi 中一样移动光标,并可以使用空格键设置标记。其实在这个模式下有很多类似 vi 的操作,譬如使用/进行搜索,使用 y 快速标记一行,使用 w 快速标记一个单词等。关于 C/P 模式下的高级操作,其文档的这一部分有比较详细的说明。

一般情况下,可以移动光标到指定位置,按下空格设置一个开头标记,然后移动光标到结尾位置,按下空格设置第二个标记,同时会将两个标记之间的部分储存在 copy/paste buffer 中,并退出 copy/paste 模式。在正常模式下,可以使用快捷键 C-a ] 将储存在 buffer 中的内容粘贴到当前窗口。

更多 screen 功能 (高级)

同大多数 UNIX 程序一样,GNU Screen 提供了丰富强大的定制功能。你可以在 Screen 的默认两级配置文件 /etc/screenrc$HOME/.screenrc 中指定更多,例如设定screen 选项,定制绑定键,设定 screen 会话自启动窗口,启用多用户模式,定制用户访问权限控制等等。如果你愿意的话,也可以自己指定 screen 配置文件。

以多用户功能为例,Screen 默认是以单用户模式运行的,你需要在配置文件中指定 multiuser on 来打开多用户模式,通过 acl*(acladd,acldel,aclchg...) 命令,你可以灵活配置其他用户访问你的 screen 会话。更多配置文件内容请参考 screen 的 man 页。

Reference

贝叶斯深度学习前沿进展 (朱军教授)

2020年11月9日 09:00

此文是关于"第十八届中国机器学习及其应用研讨会“上,朱军教授的「讲座笔记」。内容仅涉及我个人感兴趣的要点内容。

Table of Contents

这是一个信息量很大的报告。

关于深度贝叶斯研究的动机

  1. 源自对数据或环境不确定性的建模和推断,甚至有时是对抗性的!
  2. 源自模型的不确定性。模型有庞大的参数数目,而数据还存在着大量的冗余。所以,用大模型在有限的数据上训练,就会带来不确定性。再者,单个模型的输出也完全没法表征结果的不确定性。

关于贝叶斯机器学习

Bayesian (Probabilistic) Machine Learning 的两篇 outlook/review 文章推荐:


Modeling: Deep Resolution of Bayesian ML

  • How to do Bayesian inference for DNNs?
  • How to learn hierarchically structured Bayesian models?

大体上,可以分为两类做法:

Type-1: Bayes -> DNN

用贝叶斯推断的办法来做神经网络。网络有多大,贝叶斯推断做在上面也会有同样的规模。

Type-2: DNN -> Bayes

Algorithms

  • How to compute posteriors efficiently?
  • How to learn parameters?

总体来说,有两类方法:变分方法和蒙特卡洛方法。它们各自有优缺点。

提及到的文献:

朱老师在这里提到了哈密顿蒙特卡洛方法,可以处理高维空间中后验采样的问题。它实际上是用了一个动力学系统并利用梯度信息来引导粒子的演化方向,来达到快速收敛。在理论上,不计代价的话这确实是准确的,可以收敛到最优解。但是这个方法在高维空间效率是很低的,迭代次数太多,粒子的利用效率也不高,多个粒子可能会塌缩在一起的情况。

下面讲两个例子,分别引出朱老师自己的工作:

VAE 和 VAE 所面临的问题:

Inference path 中的 q 如果定义的不合适,那么优化算法即使再好,也只能得到一个近似解,无法消除误差。

朱老师他们考虑能不能把 q 这个变分分布的假设去掉。原先是用一个参数模型来定义,那么现在隐式的表示函数形式会如何呢。这个问题虽然好提,但不好解决。首先是需要基于变分分布 q 算期望很困难,隐式的函数是不清楚其形式的。另外还需要算它的梯度也是很困难,在朱老师这里是一个 $\log q$ 来表示的,不知道长什么样的就没法求梯度。

因此就变成一个函数估计问题:能否从未知 q 分布的样本中估计出 $\log q$ 的一个梯度。即通过数据估计函数。

朱老师考虑从函数空间里来思考这个问题,想能否通过函数空间中这些样本点来把梯度构造出来。这里的小技巧类似于 PCA 和 SVD 那样做分解,是要在函数空间上做 spectral 分解,由此可以得到一个精确的表示,即上面 👆 slide 中第一个公式。公式中的正交基函数可以通过一个核函数的谱分解来实现,即上面👆 slide 中的第二个公式。

为了可以实现计算,朱老师他们用到了 Nystrom 方法,从而得到了一个实际可以计算的估计器。

进一步分析与真实梯度之间的关系。其 error 的上界可以分析出来,采样数目 M 足够多的话,红色公式部分是接近于 0。J 是做 truncation 的 level,蓝色部分在 truncation 的 level 提升时,也会下降的。

这个梯度估计可以嵌入到任何一个基于梯度的方法中,无论是变分还是蒙特卡洛。从上面的定量分析效果也是还不错的。

今年,朱老师还更进一步做了一个工作。根据不同组所构造的梯度估计器,基于非参的 score estimators,做了一个统一的理论框架。发现大家做的不同的估计器,其实对应于不同的 kernel 的选择,及其对应的正则化项选择。基于这个分析,给出了统一的收敛性分析,改进了原先的分析结果,并且 recover 了 KEF 的结果,并且还给出理论保证。

随后,朱老师非常贴心的给了一屏幕的算法文献作为参考,暂时不在这里罗列了,有些工作太新还差不到收录(当然主要都是朱老师参与的工作)。

Probabilistic Programming Library (ZhuSuan)

  • How to auto/semi-auto implement Bayesian deep learning models?

最后,朱老师谈了他们开发和维护的概率模型开源库:

里面已经支持了领域内很多很好的算法:

还有一些常用的模型库:

Example

  • What problems can be solved by Bayesian deep learning?

这部分就略过了,详细还是看 slides 吧。

Full Slide

Download the slide

深度学习: 从理论到算法 (王力威教授)

2020年11月9日 08:00

此文是关于第十八届中国机器学习及其应用研讨会中,王立威教授的讲座笔记。内容仅涉及我个人感兴趣的要点内容。

Table of Contents

王老师主要是机器学习的理论。王老师是从理论的角度来谈深度学习,并且在这个报告中提高了很多 insight 和启发。

深度学习存在的问题及其不足

Yann LeCun 曾谈到:"深度学习最大的问题就是缺乏理论"。可见,深度学习现在还处在一个非常初步的一个状态。比如其中一个很典型的例子,就是经典机器学习模型都是 under-parametrization 的,然而深度神经网络确实 over-parametrization 的,即神经网络的参数个数是远远超过了训练数据的个数。在经典传统的机器学习看来,过参数化应该学不好的,应该用欠参数化的模型。但是现在深度神经网络还很好用,可见理论需要进步啊!

王老师在简述监督学习的时候谈到:

机器学习中最核心的一个假设就是:真实数据的分布是不可知的。


探索深度学习的三大角度

  • 模型,网络的结构
  • 优化,训练的过程(本讲重点)
  • 泛化,测试的过程

关于模型

从理论的角度来对模型的表示能力进行探索的方法之一:函数逼近。

在 1989 年有一个很经典的结论,叫做 Universal Approximation Theorem

一个只有单隐藏层的神经网络,可以在一个紧集上面任意的逼近一个连续函数。Shallow (but wide) networks are universal approximator

这说明即使是一个三层浅层神经网络,其表达能力已经很强大了。

不过这个结论只强调了无限宽的浅层网络,但是并没有对宽度有一个限制,于是王老师的另一个工作就做了相关的研究:

如果网络的宽度与数据的维数相比严格的大的话,那么就一定存在可以逼近任何连续可积函数的性质;反之,如果网络的宽度小于等于数据的维数的话,则一定存在大量的函数,使得网络即使再深也不可能逼近它们的。 Deep (and thin) ReLU networks are universal approximator (LPWHW, 2017)

可见,加深网络深度对网络的表示能力是有极大的好处,但前提是网络具备基本的宽度。


关于泛化

王老师谈到,虽然大部分深度学习的神经网络都是严重的过参数化的,但是其却没有发生 over-fitting,反而有着比较好的泛化能力。这一点对经典的机器学习理论是非常困惑的,甚至是矛盾的。

ICLR 2017 Best Paper:

王老师尤其推荐上面的一篇文献,值得大家去读一读。该文没有给出任何一个定理或者算法,但是该文是很有启发的。此文通篇都在进行实验,经过详细且深入的实验发现,经典的机器学习理论和我们今天深度学习所发生的现象是矛盾的。经典机器学习的泛化理论,是不能解释深度学习中所观察到的很多现象。可见,路漫漫兮。。。。

此后,老师讲了经典的机器学习泛化理论是怎样 fail 的,以及其他人为深度学习泛化理论所做的工作。让我印象深刻的有:

研究泛化还需要考虑其他影响,比如用的什么优化算法。如 Hardt et al. (ICML15) 里就谈到由于 SGD 算法是随机的,所以泛化才有了保证。反之,如果是用 GD,你会发现泛化能力会差很多。

王老师也在这个角度有一些工作,详情可以看 slides。


关于优化

为啥 SGD 在 highly non-convex 上优化的还挺好呢?即能找到全局最小化呢?

(DLLWZ, 2019)

Gradient descent finds a global minimum in training deep neural networks despite the objective function being non-convex. The current paper proves gradient descent achieves zero training loss in polynomial time for a deep over-parameterized neural network with residual connections (ResNet). Our analysis relies on the particular structure of the Gram matrix induced by the neural network architecture. This structure allows us to show the Gram matrix is stable throughout the training process and this stability implies the global optimality of the gradient descent algorithm. We further extend our analysis to deep residual convolutional neural networks and obtain a similar convergence result.

王老师经过严格的理论证明,给出了如下结论:

只要深度神经网络的优化满足如下两个条件:

  1. 网络具有一定的宽度(足够的过参数化)
  2. 随机初始化的概率分布是一个高斯分布,其中的 variance 是精挑细选的。

那么从优化的初始点出发,用 GD (SGD) 这种一阶优化算法,是一定能够找到全局最优点。并且,走到全局最优点的速度是一个指数收敛速度(线性收敛速度)。

大概在同一时间里,有两个类似的工作提出了相近的结论:

Allen-Zhu et al. A convergence theory for deep learning via over-parameterization PDF

Zou et al. Stochastic gradient descent optimizes over-parameterized deep ReLU networks PDF

另外,王老师还提到一篇很有趣的文章:如果数学上假定网络的宽度是达到无限宽的极限时,此时的网络就会退化到一个经典的机器学习模型,即 kernel machine。

(NIPS2018)


小结

最后,王老师谈到上述的一系列工作背后,可以得到两个 insights:

  1. 当网络足够过参数化后,尽管网络的 loss landscape 是高度非凸优化的,但是其在初始化点附近的局部邻域里是有着极好的几何性质。换句话说,在初始化点的附近,虽然网络输出对于输入数据来讲是高度非线性的,但我们网络的输出对于网络的参数来讲,是一个近似线性的关系。因此,当你做一个 non-convex 优化的时候,初始点附近的邻域里面实际上是在做一个凸优化的问题;
  2. 由于充分过参数化,在初始点附近的小邻域里面还包含了一个全局最小点。

Full Slide

Download the slide

累积引力波事件率图的 python 实现

2020年11月4日 08:00

数据来源 (Data Sources):

LIGO-Virgo Cumulative Event Rate Plot O1-O3a (added a separate plot to include O3b public alerts)

Note that G1901322 contains confirmed events and public alerts. And original cumulative event rate plot was coded in MATLAB.


Loading packages and data

import numpy as np
import datetime
import matplotlib.pyplot as plt

gw_event = [20150914,20151012,20151226, # O1 events
 20170104,20170608,20170729,20170809,20170814,20170817,20170818,20170823,
 20190408,20190412,20190413,20190413,20190421,20190424,20190425,20190426,
 20190503,20190512,20190513,20190514,20190517,20190519,20190521,20190521,
 20190527,20190602,20190620,20190630,20190701,20190706,20190707,20190708,
 20190719,20190720,20190727,20190728,20190731,20190803,20190814,20190828,
 20190828,20190909,20190910,20190915,20190924,20190929,20190930, # gap between O3a and O3b
 20191105,20191109,20191129,20191204,20191205,20191213,20191215,
 20191216,20191222,20200105,20200112,20200114,20200115,20200128,20200129,
 20200208,20200213,20200219,20200224,20200225,20200302,20200311,20200316];
assert sorted(gw_event) == gw_event

datetime_event = [datetime.datetime.strptime(str(t), '%Y%m%d') for t in gw_event]
# Know more about Format Codes? see: https://docs.python.org/zh-cn/3/library/datetime.html#strftime-strptime-behavior

num_event = len(datetime_event)
print('Total number of GW events:', num_event)

Total number of GW events: 73

Pre-processing the data for drawing

O1_start = datetime.datetime(2015,9,12)
O1_end = datetime.datetime(2016,1,19)
len_O1 = O1_end - O1_start

O2_start = datetime.datetime(2016,11,30)
O2_end = datetime.datetime(2017,8,25)
len_O2 = O2_end - O2_start

O3a_start = datetime.datetime(2019,4,1)
O3a_end = datetime.datetime(2019,9,30)
len_O3a = O3a_end - O3a_start

O3b_start = datetime.datetime(2019,11,1)
O3b_end = datetime.datetime(2020,4,30)
len_O3b = O3b_end - O3b_start

total_days = len_O1 + len_O2 + len_O3a + len_O3b
O1 = len_O1
O2 = len_O1 + len_O2
O3a = len_O1 + len_O2 + len_O3a
O3b = len_O1 + len_O2 + len_O3a + len_O3b

nev_O1 = sum((O1_start <= np.asarray(datetime_event)) & (np.asarray(datetime_event) <= O1_end))
nev_O2 = sum((O2_start <= np.asarray(datetime_event)) & (np.asarray(datetime_event) <= O2_end))
nev_O3a = sum((O3a_start <= np.asarray(datetime_event)) & (np.asarray(datetime_event) <= O3a_end))
nev_O3b = sum((O3b_start <= np.asarray(datetime_event)) & (np.asarray(datetime_event) <= O3b_end))
assert num_event == nev_O1 + nev_O2 + nev_O3a + nev_O3b

print('Total of days:', total_days.days)
print('Number of days in O1/O2/O3a/O3b:', '{}/{}/{}/{}'.format(len_O1.days,len_O2.days,len_O3a.days,len_O3b.days))
print('Number of events in O1/O2/O3a/O3b:', '{}/{}/{}/{}'.format(nev_O1, nev_O2, nev_O3a, nev_O3b))

Total of days: 760

Number of days in O1/O2/O3a/O3b: 129/268/182/181

Number of events in O1/O2/O3a/O3b: 3/8/39/23

Figure: Cumulative Count of GW Events by dates.

plt.figure(figsize=(8,4))
plt.hist(datetime_event, bins=73, histtype='step', cumulative=True, color='k', linewidth=2)
plt.ylabel('Cumulative #Events/Candidates')
plt.fill_betweenx([-1,80], O1_start,O1_end , color=[230/256,179/255,179/255])
plt.fill_betweenx([-1,80], O2_start,O2_end , color=[179/256,230/255,181/255])
plt.fill_betweenx([-1,80], O3a_start,O3a_end , color=[179/256,179/255,228/255])
plt.fill_betweenx([-1,80], O3b_start,O3b_end , color=[255/256,179/255,84/255])
plt.ylim(-1,80)
plt.xlim(O1_start, O3b_end)
plt.title('''Cumulative Count of Events and (non-retracted Alerts)
O1={}, O2={}, O3a={}, O3b={}, Total={}'''.format(nev_O1,nev_O2,nev_O3a,nev_O3b,num_event))
plt.savefig('cumulative_events_by_date.png', dpi=300, bbox_inches='tight')
plt.show()

Figure: Cumulative Count of GW Events by days of running for each event.

def getstart(t):
 if (O1_start <= t) & (t <= O1_end):
 return O1_start
 elif (O2_start <= t) & (t <= O2_end):
 return O2_start - O1
 elif (O3a_start <= t) & (t <= O3a_end):
 return O3a_start - O2
 elif (O3b_start <= t) & (t <= O3b_end):
 return O3b_start - O3a
 else:
 raise
x = [ (t-getstart(t)).days for t in datetime_event]
y = range(len(datetime_event))

plt.figure(figsize=(7,5))
plt.plot(x, y, drawstyle='steps-post', color='k', linewidth=2)
plt.fill_betweenx([-1,80], 0, O1.days , color=[230/256,179/255,179/255])
plt.fill_betweenx([-1,80], O1.days, O2.days , color=[179/256,230/255,181/255])
plt.fill_betweenx([-1,80], O2.days, O3a.days , color=[179/256,179/255,228/255])
plt.fill_betweenx([-1,80], O3a.days, O3b.days , color=[255/256,179/255,84/255], )
plt.ylim(-1,80)
plt.xlim(0, O3b.days)
plt.ylabel('Cumulative #Events/Candidates')
plt.xlabel('Time (Days)')
plt.title('''Cumulative Count of Events and (non-retracted Alerts)
O1={}, O2={}, O3a={}, O3b={}, Total={}'''.format(nev_O1,nev_O2,nev_O3a,nev_O3b,num_event))
plt.text(O1.days*0.3, num_event*0.6, 'O1', fontsize=15)
plt.text(O1.days+(O2.days-O1.days)*0.4, num_event*0.6, 'O2', fontsize=15)
plt.text(O2.days+(O3a.days-O2.days)*0.3, num_event*0.6, 'O3a', fontsize=15)
plt.text(O3a.days+(O3b.days-O3a.days)*0.3, num_event*0.6, 'O3b', fontsize=15)
plt.savefig('cumulative_events_by_days.png', dpi=300, bbox_inches='tight')
plt.show()

Markdown Elements for Hugo/Wowchemy

2020年11月3日 08:00

Update to keep the very latest developments:


Sources:

Shortcodes are plugins which are bundled with Wowchemy or inherited from Hugo. Additionally, HTML may be written in Markdown documents for advanced formatting.

FYI:


YAML

Front matter allows page-specific metadata and functionality to be included at the top of a Markdown file.

In the documentation and the example site, we will predominantly use YAML to format the front matter of content files and TOML to format the configuration files and widget files. This is because TOML is more human-friendly but popular Markdown editors primarily support YAML front matter in content files.

A complete list of standard options can be found on the corresponding Hugo docs page.

The following is an informative example for this page:

## The title of your page (Core)
title: 'Markdown Elements for Hugo/Wowchemy' # (Core)
## An optional subtitle that will be displayed under the title
subtitle: "A complete tutorial for writting markdown on Wowchemy"

## A one-sentence summary of the content on your page. 
## The summary can be shown on the homepage and can also benefit your search engine ranking.
summary: 'This article gives an overview of the most common formatting options, including features that are exclusive to Wowchemy.' # (Core)

## Display the authors of the page and link to their user profiles if they exist.
authors: # (Core)
- Geo
- Herb

## Tagging your content helps users to discover similar content on your site. 
## Tags can improve search relevancy and are displayed after the page content and also in the Tag Cloud widget.
tags: # (Core)
- Python
- Markdown
- Hugo
- Wowchemy

## Categorizing your content helps users to discover similar content on your site. 
## Categories can improve search relevancy and display at the top of a page alongside a page’s metadata.
categories:
- Tutorial

## The RFC 3339 date that the page was published. 
date: "2020-11-03T00:00:00Z" # (Core)
show_date: true # Dates can now be hidden from pages by adding show_date: false in page front matter or by automatically applying it to all pages in a collection using Hugo's cascade:>show_date: false in the _index.md file.

## The RFC 3339 date that the page was published. 
## You only need to specify this option if you wish to set date 
## in the future but publish the page now, as is the case for 
## publishing a journal article that is to appear in a journal etc.
# publishDate: "2020-11-03T00:00:00Z"

## The RFC 3339 date that the page was last modified. 
## If using Git, enable enableGitInfo in `config.toml` to have 
## the page modification date automatically updated, rather than manually specifying lastmod.
lastmod: "2020-11-03T00:00:00Z"

## By setting `featured: true`, a page can be displayed in the Featured widget. 
## This is useful for sticky, announcement blog posts or selected publications etc.
featured: false

## By setting `draft: true`, only you will see your page 
## when you preview your site locally on your computer.
draft: false

## Featured image
## To use, add an image named `featured.jpg/png` to your page's folder.
## Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
## Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
image:
 placement: 1
 caption: "Credit by [**Wowchemy**](https://wowchemy.com/)"
 focal_point: "Center"
 preview_only: false
 alt_text: An optional description of the image for screen readers.


## Projects (optional).
## Associate this post with one or more of your projects.
## Simply enter your project's folder or file name without extension.
## E.g. `projects = ["internal-project"]` references `content/project/internal-project/index.md`.
## Otherwise, set `projects = []`.
projects: [GWDA]

## Page resources
## Buttons can be generated in the page header to link to associated resources.
## The example below shows how to create a Twitter link for a project and 
## how to create a link to a post that was originally published on Medium:
links:
 - icon_pack: fab
 icon: twitter
 name: Follow
 url: 'https://twitter.com/Herb_hewang' # (required)
 # - icon_pack: fab
 # icon: medium
 # name: Originally published on Medium
 # url: 'https://medium.com' # (required)

## The following parameters can be added to the front matter of 
## a page (such as a blog post) to control its features:
reading_time: true # Show estimated reading time?
share: true # Show social sharing links?
profile: true # Show author profile?
commentable: false # Allow visitors to comment? Supported by the Page, Post, and Docs content types.
editable: false # Allow visitors to edit the page? Supported by the Page, Post, and Docs content types. 

## To enable LaTeX math rendering for a page, you should include `math: true` in the page’s front matter.
## I have enabled math on the homepage or for all pages, by globally setting `math = true` in `config/_default/params`
# math: true

## Enable a Markdown extension for diagrams by toggling the diagram 
## option in your `config/_default/params.toml` file or 
## by adding `diagram: true` to your page front matter.
diagram: true

## Image gallery:
## To add an image gallery to a page bundle
# Discarded for any remote gallery images see: https://wowchemy.com/blog/v5.1.0/#apply-breaking-changes
gallery_item:
- album: 'branch-bundle-1'
 image: 'GW150914Anniversary.png'
 caption: 'Write your image caption here' # only shown when zoom out
 order: "asc" # "asc" or "desc"
 resize_options: # which supports Hugo image processing options.
# - album: gallery # can not be replaced
# image: 'sketch5.png' # `static/media/sketch5.png`
# caption: A caption # only shown when zoom out
# - album: gallery
# image: https://vip1.loli.net/2020/11/11/OmVGhaz79iQJsvj.png
# caption: Another caption # only shown when zoom out


## (Optional) Header image (relative to `assets/media/` folder).
## To display a full width header image, the header parameters below can be 
## inserted towards the end of a page’s front matter. It is assumed that the 
## image is located in your `assets/media/` media library
header: # (It not works.....)
 image: "header.png"
 caption: "Image credit: [**MLflow**](https://mlflow.org)"

Don’t want to publish author pages?

To un-publish author pages from the site, update with hugo mod get -u ./... and then create a content/authors/_index.md file with the following: (Discord)

---
_build:
 render: never
cascade:
 _build:
 render: never
 list: always
---

Sub-headings

## Heading 2
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6

Generate a custom heading ID by including an attribute. For example:

## Reference A {#foo}
## Reference B {id="bar"}

produces this HTML:

<h2 id="foo">Reference A</h2>
<h2 id="bar">Reference B</h2>

Hugo will generate unique element IDs if the same heading appears more than once on a page. For example:

## Reference
## Reference
## Reference

produces this HTML:

<h2 id="reference">Reference</h2>
<h2 id="reference-1">Reference</h2>
<h2 id="reference-2">Reference</h2>

Emphasis

Italics with _underscores_.

Bold with **asterisks**.

Combined emphasis with **asterisks and _underscores_**.

Strikethrough with ~~two tildes~~.

Italics with underscores.

Bold with asterisks.

Combined emphasis with asterisks and underscores.

Strikethrough with two tildes.

Lists

Ordered

1. First item
2. Another item
  1. First item
  2. Another item

Unordered

* First item
* Another item
  • First item
  • Another item

Todo

Todo lists can be written in Wowchemy by using the standard Markdown syntax:

- [x] Write math example
- [x] Write diagram example
- [ ] Do something else

renders as

Images

Images may be added to a page by either placing them in your assets/media/ (Wowchemy v5.1.0) media library or in your page’s folder, and then referencing them using one of the following notations.

A figure from your assets/media/ media library:

{{< figure library="true" src="pic01.jpg" title="A caption" >}}
A caption
A caption

A figure within a page’s folder (e.g. content/post/writting-mardown/) :

{{< figure src="pic02.jpg" title="A caption" >}}
A caption
A caption

A numbered figure with caption:

{{< figure src="pic02.jpg" title="A caption" numbered="true" >}}
A caption
A caption

Or you can use the more portable Markdown syntax for displaying an image from the page’s folder, however it has limited options compared with the Figure shortcode above:

![alternative text for search engines](pic02.jpg)

alternative text for search engines

You can now create figures using the standard, portable Markdown syntax ![screen reader text](image.jpg "caption") where the image is located in the page folder, media library, or remotely. This means that your figures can now be rendered by any Markdown editor, such as Visual Studio Code, Typora, and Obsidian :meow_wow:

  • How to disable image spotlight on click? Put lightbox="false" in your figure shortcode.

Figures may be cross-referenced.

[Requires Wowchemy v5.1+]:

  • It support custom figure IDs for easier cross-referencing
    • For example, {{< figure src="image.jpg" id="wowchemy" >}} can be cross-referenced with [A Figure](#figure-wowchemy)
    • This way, the cross-reference is unaffected by changes to captions
  • Dynamically theme images according to the user’s light or dark theme
    • {{< figure src="image.jpg" caption="test" theme="light" >}} inverts image when browsing in dark mode
    • {{< figure src="image.jpg" caption="test" theme="dark" >}} inverts image when browsing in light mode

Image gallery

To add an image gallery to a page bundle:

  1. Create a gallery album folder within your page bundle (i.e. within your page’s own folder)
  2. Add images to your new album folder (Note that all galleries from the assets/media/albums/folder)
  3. Paste {{ < gallery album="<ALBUM FOLDER>" >}} where you would like the gallery to appear in the page content, changing the album parameter to match the name of your album folder (in this case: /branch-bundle-1/)
{{< gallery album="branch-bundle-1" >}}

Note that album names should be lowercase.

Optionally, to add captions for your images, add the following instances to the end of your page’s front matter:

gallery_item:
- album: 'branch-bundle-1'
 image: 'GW150914Anniversary.png'
 caption: 'Write your image caption here' # only shown when zoom out
 order: "asc" # "asc" or "desc"
 resize_options: # which supports Hugo image processing options.

Alternatively, create an image gallery with images from the internet or your static/media/ (Wowchemy v5) media library: (# Discarded for any remote gallery images in v5.1.0, see more)

  1. Add gallery images to within your static/media/ media library folder

  2. Reference your images at the end of the front matter of a content file in the form:

    gallery_item:
    - album: gallery  # can not be replaced
     image: 'sketch5.png' # `static/media/sketch5.png`
     caption: A caption  # only shown when zoom out
    - album: gallery
     image: https://vip1.loli.net/2020/11/11/OmVGhaz79iQJsvj.png
     caption: Another caption  # only shown when zoom out 
    
  3. Display the gallery somewhere within your page content by using

    {{< gallery >}}
    
For docs pages (i.e. pages using the courses and documentation layout), gallery images must be placed in the static/ media library using the second approach (due to limitations of Hugo).

Cite

To cite a page or publication, you can use the cite shortcode (Wowchemy v5+), referencing a folder and page name that you created:

{{<cite page="/publication/2012-krizhevsky-image-net-classification-deep" view="4" >}}

where view corresponds to one of the available listing views used throughout Wowchemy:

  1. Stream:

  2. Compact: (default)

  3. Card:

  4. Traditional academic citation, configured by the citation_style setting in params.toml:

If you don’t specify a view, it will default to the compact view.

Audio

You can add a podcast or music to a page by placing the MP3 file in the page’s folder and then referencing the audio file using the audio shortcode (Wowchemy v5+):

{{< audio src="GW150914.mp3" >}}
<!-- "Chirp" ringtones from the first GW LIGO detections -->

Videos

The following kinds of video may be added to a page.

Local video file

Videos may be added to a page by either placing them in your assets/media/ media library or in your page’s folder, and then referencing them using one of the following notations (Wowchemy v5.1.0).

A video from your assets/media/ media library:

{{< video library="true" src="mf_GW150914.mp4" controls="yes" >}}

A video within a page’s folder (e.g. content/post/writting-markdown/):

{{< video src="mf_GW151226.mp4" controls="yes" >}}

Youtube

The youtube shortcode embeds a responsive video player for YouTube videos. Only the ID of the video is required, e.g.: https://youtu.be/7jNUCOayjEA. Copy the YouTube video ID that follows / in the video’s URL and pass it to the youtube shortcode:

{{< youtube 7jNUCOayjEA >}}

Furthermore, you can automatically start playback of the embedded video by setting the autoplay parameter to true. Remember that you can’t mix named and unnamed parameters, so you’ll need to assign the yet unnamed video id to the parameter id:

{{< youtube id="7jNUCOayjEA" autoplay="true" >}}

Vimeo

Adding a video from Vimeo is equivalent to the YouTube Input shortcode. E.g.: https://vimeo.com/channels/staffpicks/146022717. Extract the ID from the video’s URL and pass it to the vimeo shortcode:

{{< vimeo 146022717 >}}

If you want to further customize the visual styling of the YouTube or Vimeo output, add a class named parameter when calling the shortcode. The new class will be added to the <div> that wraps the <iframe> and will remove the inline styles. Note that you will need to call the id as a named parameter as well. You can also give the vimeo video a descriptive title with title.


 <div
class="my-vimeo-wrapper-class">
<iframe
src="https://player.vimeo.com/video/146022717?dnt=0" allow="fullscreen" title="My vimeo video">
</iframe>
</div>

Links

- [I'm a link](https://www.google.com) <!--open in new tab-->
- [I'm a link with title](https://iphysresearch.github.io/blog/about/ "Click This!") <!--hovering-->
- [A post]({{< ref "/post/ML_notes/s-dbw-validity-index/index.md" >}})
- [A publication]({{< ref "/publication/2012-krizhevsky-image-net-classification-deep/index.md" >}})
- [A project]({{< ref "/project/CS231n/index.md" >}})
- [A relative link from one post to another post]({{< relref "../DL_notes/receptive-field/index.md#感受野计算" >}})
- [Scroll down to a page section with heading *Vimeo*](#vimeo)

To enable linking to a file, such as a PDF, first place the file in your static/files/ folder and then link to it using the following form:

{{% staticref "files/LiWeiWang_DL_From_Theory_to_Algorithm.pdf" "newtab" %}}Download the PDF{{% /staticref %}}

Download the PDF

The optional "newtab" argument for staticref will cause the link to be opened in a new tab.

Figures

To cross-reference a figure:

  1. Retrieve the figure ID. The figure ID consists of a URL friendly equivalent of the image caption prefixed with figure-. To grab the exact ID, preview the page in Hugo, right click a figure and click Inspect in your browser to grab the value of the figure’s id field.
  2. Create a link to the figure in the form [a link to a figure](#figure-FIGURES-CAPTION).
<!-- Only the first three figures have the same `id` field. -->
[a link to a figure](#figure-a-caption)

a link to a figure

Tags and Categories

Use {{< list_tags >}} to provide a list of linked tags or {{< list_categories >}} to provide a list of linked categories.

Charts

Wowchemy supports the popular Plotly chart format.

Save your Plotly JSON in your page folder, for example chart.json, and then add the {{< chart data="chart" >}} shortcode where you would like the chart to appear.

Demo:

{{< chart data="contour_scatter" >}}
<!--`content/post/writting-mardown/contour_scatter.json`-->

You might also find the Plotly JSON Editor useful.

Emojis

See the Emoji cheat sheet for available emoticons. The following serves as an example, but you should remove the spaces between each emoji name and pair of semicolons:

I : heart : Wowchemy : smile :

I ❤️ Wowchemy 😄

Icons

Since v4.8+, Wowchemy enables you to use a wide range of icons from Font Awesome and Academicons in addition to emojis.

Icon pack “fab” includes the following brand icons:

  • twitter, weixin, weibo, linkedin, github, facebook, pinterest, twitch, youtube, instagram, soundcloud
  • See all icons

Icon packs “fas” and “far” include the following general icons:

  • fax, envelope (for email), comments (for discussion forum)
  • See all icons

Icon pack “ai” includes the following academic icons:

  • cv, google-scholar, arxiv, orcid, researchgate, mendeley
  • See all icons
  • To enable the academic icon pack in v5+, set ai = true under [icon.pack] in params.toml

Icon pack “emoji” gives you the ability to use emojis as icons

  • See all icons
  • Enter the emoji shortcode, such as :heart:, in Wowchemy’s icon field
  • Wowchemy v4.9+ is required to utilise the emoji icon pack and can currently only be used in the Featurette (skills) widget.

Icon pack “custom” gives you the ability to use custom SVG icons

  • Create an SVG icon in your favorite image editor or download one from a site such as Flat Icon
  • Place the custom SVG icon in assets/images/icon-pack/, creating the folders if necessary
  • Reference the SVG icon name (without .svg extension) in the icon field
  • Wowchemy v4.9+ is required to utilise the custom icon pack and can currently only be used in the Featurette (skills) widget.

Here are some examples using the icon shortcode to render icons:

{{< icon name="terminal" pack="fas" >}} Terminal
{{< icon name="chart-line" pack="fas" >}} Chart Line
{{< icon name="python" pack="fab" >}} Python
{{< icon name="r-project" pack="fab" >}} R
{{< icon name="arxiv" pack="ai" >}} ArXiv
{{< icon name="google-scholar" pack="ai" >}} Google Scholar

renders as

Terminal
Chart Line
Python
R
ArXiv
Google Scholar

Optionally, left and right padding can be added to an icon using the padding_left="3" and padding_right="3" options, respectively.

Blockquote

> This is a blockquote.

This is a blockquote.

Highlight quote

This is a {{< hl >}} highlighted quote {{< /hl >}}.

This is a highlighted quote.

Mention a user

To mention someone (Wowchemy v4.6+ required), type {{% mention "username" %}} where username corresponds to a user account in Wowchemy.

{{% mention "Herb" %}}

Herb

Footnotes

I have more [^1] to say.
I have more more [^anymark] to say.

[^1]: Footnote example.
[^anymark]: Another footnote example.

I have more 1 to say. I have more more 2 to say.

Embed Documents

The following kinds of document may be embedded into a page.

To embed Google Documents (e.g. slide deck), click File > Publish to web > Embed in Google Docs and copy the URL within the displayed src="..." attribute. Then paste the URL in the form:

{{< gdocs src="https://docs.google.com/..." >}}

renders as (my file)

To embed slide from slides.com. Just copy the URL and do the same things as following:

{{< gdocs src="https://slides.com/iphysresearch/journal_club_20201020/embed" scrolling="no" >}}

or use the full Embed <iframe> code:

<iframe src="https://slides.com/iphysresearch/journal_club_20201020/embed" width="576" height="420" scrolling="no" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen ></iframe>

Diagrams

Wowchemy supports a Markdown extension for diagrams (Wowchemy v4.4.0+ required). You can enable this feature by toggling the diagram option in your config/_default/params.toml file or by adding diagram: true to your page front matter. Then insert your Mermaid diagram syntax within a mermaid code block as seen below and that’s it.

An example flowchart:

```mermaid
graph TD;
 A-->B;
 A-->C;
 B-->D;
 C-->D;
```

renders as

graph TD; A-->B; A-->C; B-->D; C-->D;

An example sequence diagram:

```mermaid
sequenceDiagram
 participant Alice
 participant Bob
 Alice->John: Hello John, how are you?
 loop Healthcheck
 John->John: Fight against hypochondria
 end
 Note right of John: Rational thoughts <br/>prevail...
 John-->Alice: Great!
 John->Bob: How about you?
 Bob-->John: Jolly good!
```

renders as

sequenceDiagram participant Alice participant Bob Alice->John: Hello John, how are you? loop Healthcheck John->John: Fight against hypochondria end Note right of John: Rational thoughts
prevail... John-->Alice: Great! John->Bob: How about you? Bob-->John: Jolly good!

An example Gantt diagram:

```mermaid
gantt
dateFormat YYYY-MM-DD
title Adding GANTT diagram to mermaid
excludes weekdays 2014-01-10

section A section
Completed task :done, des1, 2014-01-06,2014-01-08
Active task :active, des2, 2014-01-09, 3d
Future task : des3, after des2, 5d
Future task2 : des4, after des3, 5d
```

renders as

gantt dateFormat YYYY-MM-DD title Adding GANTT diagram to mermaid excludes weekdays 2014-01-10 section A section Completed task :done, des1, 2014-01-06,2014-01-08 Active task :active, des2, 2014-01-09, 3d Future task : des3, after des2, 5d Future task2 : des4, after des3, 5d

Advanced diagrams

More advanced diagrams can be created in the open source draw.io editor. The editor has support for almost any type of diagram, from simple to complex. A diagram can be easily embedded in Wowchemy by choosing File > Embed > SVG in the draw.io editor and pasting the generated code into your page.

Alternatively, a diagram can be exported as an image from any drawing software, or a document/slide containing a diagram can be embedded.

One example from https://github.com/iphysresearch/drawio/blob/master/nifi-architecture.drawio:

OS/Host
OS/H…
JVM
JVM
Web Server
Web Se…
Flow Controller
Flow Controller
Processor 1
Processor 1
FlowFile Repository
Flow…
Content Repository
Cont…
Provenance Repository
Prov…
Extension N
Extension N
Local Storage
Local Storage
Viewer does not support full SVG 1.1

Code highlighting

Pass the language of the code, such as python, as a parameter after three backticks:

```python
# Example of code highlighting
input_string_var = input("Enter some data: ")
print("You entered: {}".format(input_string_var))
```

Result:

# Example of code highlighting
input_string_var = input("Enter some data: ")
print("You entered: {}".format(input_string_var))

The Wowchemy theme uses highlight.js for source code highlighting, and highlighting is enabled by default for all pages. However, several configuration options are supported that allow finer-grained control over highlight.js.

Jupyter Notebook

View the guide to blogging with Jupyter Notebooks.

Alternatively, a Jupyter notebook can be embedded in a page by following one of the approaches below:

  1. Upload your notebook as a GitHub Gist and click Embed to copy and paste your hosted notebook into the body of content in Wowchemy.

    Example:

    <script src="https://gist.github.com/iphysresearch/bd6a647358697bc19b32375a04969f47.js"></script>
    
  2. Convert your notebook to HTML using jupyter nbconvert --to html <NOTEBOOK_NAME>.ipynb. Then move the resulting HTML file to your page’s folder and embed it into the body of the page’s Markdown file using:

    <iframe
     src="./<CONVERTED_NOTEBOOK_FILENAME>"
     width="90%"
     height="1000px"
     style="border:none;">
    </iframe>
    
  3. Upload your notebook to a cloud notebook service such as Microsoft Azure, Google Cloud Datalab or Kyso. Then click their Embed button, pasting their custom embedding code into the body of your page’s Markdown file.

  4. Copy snippets of code from your notebook and paste them into the body of your page using Wowchemy’s code highlighting.

GitHub gist

Twitter tweet

To include a single tweet, just use the embed codes:

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Data is/are <a href="https://t.co/pmacbnnaJN">pic.twitter.com/pmacbnnaJN</a></p>&mdash; PHD Comics (@PHDcomics) <a href="https://twitter.com/PHDcomics/status/1296758564594126849?ref_src=twsrc%5Etfw">August 21, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

renders as

Data is/are pic.twitter.com/pmacbnnaJN

— PHD Comics (@PHDcomics) August 21, 2020

$\LaTeX$ math

Wowchemy supports a Markdown extension for $LaTeX$ math. You can enable this feature by toggling the math option in your config/_default/params.toml file.

To render inline or block math, wrap your LaTeX math with $...$ or $$...$$, respectively.

Example math block:

$$\gamma_{n} = \frac{
\left | \left (\mathbf x_{n} - \mathbf x_{n-1} \right )^T
\left [\nabla F (\mathbf x_{n}) - \nabla F (\mathbf x_{n-1}) \right ] \right |}
{\left \|\nabla F(\mathbf{x}_{n}) - \nabla F(\mathbf{x}_{n-1}) \right \|^2}$$

renders as

$$\gamma_{n} = \frac{ \left | \left (\mathbf x_{n} - \mathbf x_{n-1} \right )^T \left [\nabla F (\mathbf x_{n}) - \nabla F (\mathbf x_{n-1}) \right ] \right |} {\left |\nabla F(\mathbf{x}{n}) - \nabla F(\mathbf{x}{n-1}) \right |^2}$$

Example inline math $\nabla F(\mathbf{x}_{n})$ renders as $\nabla F(\mathbf{x}_{n})$.

Example multi-line math using the \\\\ math linebreak:

$$
f(k;p\_0^{*}) = \begin{cases} p\_0^{\*} & \text{if }k=1, \\\\
1-p\_0^{\*} & \text {if }k=0.\end{cases}
$$

renders as

$$ f(k;p_0^{*}) = \begin{cases} p_0^{*} & \text{if }k=1, \\ 1-p_0^{*} & \text {if }k=0.\end{cases} $$

As Hugo and Wowchemy attempt to parse YAML, Markdown, and LaTeX content in the abstract field for publications and talks, Markdown special characters need to be escaped in any math within the abstract fields by using a backslash to prevent the math being parsed as Markdown. The following tips may help:

  • escape each LaTeX backslash (\) with an extra backslash, yielding \\
  • escape each LaTeX underscore (_) with one backslashes, yielding \_
  • escape each LaTeX backslash (*) with an extra backslash, yielding \*

Hence, abstract: “${O(d_{\max})}$” becomes abstract: “${O(d\_{\max})}$”.

Table

Code:

| Command | Description |
| ------------------| ------------------------------ |
| `hugo` | Build your website. |
| `hugo serve -w` | View your website. |

Result:

Command Description
hugo Build your website.
hugo serve -w View your website.

Callouts

Wowchemy supports a Markdown extension for callouts, also referred to as alerts or asides.

Callouts are a useful feature to draw attention to important or related content such as notes, hints, or warnings in your articles. They are especially handy when writing educational tutorial-style articles or documentation.

A callout can be created by using the Callout shortcode below. (For older Wowchemy versions prior to v5, replace callout in the examples below with alert.)

Wowchemy comes built-in with a few different styles of callouts.

The paragraph will render as a callout with the default note style:

{{% callout note %}}
A Markdown callout is useful for displaying notices, hints, or definitions to your readers.
{{% /callout %}}

This will display the following note block:

A Markdown callout is useful for displaying notices, hints, or definitions to your readers.

Alternatively, a warning can be displayed to the reader using the warning option:

{{% callout warning %}}
Here's some important information...
{{% /callout %}}

This will display the following warning notice to the reader:

Here’s some important information…

Table of Contents

A table of contents may be particularly useful for long posts or tutorial/documentation type content. Use the {{% toc %}} shortcode anywhere you wish within your Markdown content to automatically generate a table of contents.

Or add side TOC for posts like this post:

Ref: Add table of contents for posts

Change the file <root dir>/layouts/_default/single.html as follows: (as default)

{{- define "main" -}}
<div class="container-fluid docs">
 <div class="row flex-xl-nowrap">

 <div class="d-none d-xl-block col-xl-2 docs-toc">
 <ul class="nav toc-top">
 <li><a href="#" id="back_to_top" class="docs-toc-title">{{ i18n "on_this_page" }}</a></li>
 </ul>
 {{ .TableOfContents }}
 {{ partial "docs_toc_foot" . }}
 </div>

 <main class="col-12 col-md-0 col-xl-10 py-md-3 pl-md-5 docs-content" role="main">
 <article class="article">

 {{ partial "page_header" . }}

 <div class="article-container">

 <div class="article-style">
 {{ .Content }}
 </div>

 {{ partial "page_footer" . }}

 </div>
 </article>
 </main>

 </div>
</div>

{{- end -}}

  1. Footnote example. ↩︎

  2. Another footnote example. ↩︎

❌
❌