Recent months have seen the emergence of a powerful new trend in which large language models are augmented to become “agents”—software entities capable of performing tasks on their own, ultimately in the service of a goal, rather than simply responding to queries from human users.
From the example above you can see what the authors propose: combining thought and action. Pure reasoning, such as chain of thought, cannot answer questions well on its own, because it is limited to the knowledge stored inside the pretrained model. And what if there is only action? That means searching over and over, and if the search never turns up the right answer, it is all for nothing. My understanding is that the authors are saying: if we want a general-purpose agent, we have to tell it to think while acting. Think things through first, then decide what action to take next, and every action in turn produces feedback from the environment. Take the authors' second example: when you go to the countertop you see an apple, some bread, a pepper shaker, and a vase; since the goal is to put the pepper shaker into a drawer, you can now pick up the pepper shaker. This is easy to relate to, because a capable person also thinks while doing. That gives rise to the new prompting paradigm the authors propose:
Thought: ...
Action: ...
Observation: ...
... (Repeated many times)
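To make the loop concrete, here is a minimal sketch of a ReAct-style agent loop. `llm` and `search` are hypothetical callables supplied by the caller (the model call and the single tool), and the `Search[...]` / `Finish[...]` action format is just one possible convention:

```python
# Minimal sketch of a ReAct-style loop. `llm(prompt)` is assumed to return the
# model's next Thought/Action text; `search(query)` returns an observation string.
def react_agent(question, llm, search, max_steps=5):
    trajectory = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(trajectory + "Thought:")           # model produces a Thought plus an Action
        trajectory += "Thought:" + step + "\n"
        if "Finish[" in step:                         # the model decided it has the answer
            return step.split("Finish[", 1)[1].rstrip("]\n ")
        if "Search[" in step:                         # the model asked to use the search tool
            query = step.split("Search[", 1)[1].split("]", 1)[0]
            observation = search(query)               # feedback from the environment
            trajectory += f"Observation: {observation}\n"
    return None
```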
The idea is roughly the same as Stanford's DSP. When a user asks a question, one option is to hand the question straight to the LLM and let it answer. The other option is to first run the question through a retrieval model before passing it to the LLM: this model pulls out some context relevant to answering the question, that context is added to the prompt, and only then is everything handed to the LLM. There is a specific name for this pattern, introduced in Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP: retrieve-then-read. If you are interested, take a look at the notebook Stanford released alongside this work; after reading it you will understand why retrieving some context and adding it to the prompt helps (a minimal sketch follows the ColBERT snippet below).
from colbert.infra import Run, RunConfig, ColBERTConfig

with Run().context(RunConfig(nranks=4, experiment='notebook')):  # nranks specifies the number of GPUs to use.
    config = ColBERTConfig(doc_maxlen=doc_maxlen, nbits=nbits)
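Once the index is built, retrieve-then-read is just: search the index, paste the top passages into the prompt, then call the LLM. A minimal sketch, assuming an index built as above and a hypothetical `ask_llm` function standing in for whatever LLM API you actually call:

```python
from colbert import Searcher

# Sketch of retrieve-then-read on top of a ColBERT index. `ask_llm` is a
# hypothetical stand-in for the LLM call; `index_name` is the index built earlier.
def retrieve_then_read(question, index_name, ask_llm, k=3):
    searcher = Searcher(index=index_name)
    passage_ids, _, _ = searcher.search(question, k=k)            # top-k passage ids
    contexts = [searcher.collection[pid] for pid in passage_ids]  # look up the passage text
    prompt = "Context:\n" + "\n".join(contexts) + f"\n\nQuestion: {question}\nAnswer:"
    return ask_llm(prompt)                                        # the LLM reads the retrieved context before answering
```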
{"id": "seed_task_1", "name": "antonym_relation", "instruction": "What is the relation between the given pairs?", "instances": [{"input": "Night : Day :: Right : Left", "output": "The relation between the given pairs is that they are opposites."}], "is_classification": false}
The authors concatenate three seed_tasks together and prepend a predefined prompt:
def encode_prompt(prompt_instructions):
    """Encode multiple prompt instructions into a single string."""
    prompt = open("./prompt.txt").read() + "\n"
You are asked to come up with a set of 20 diverse task instructions. These task instructions will be given to a GPT model and we will evaluate the GPT model for completing the instructions.
Here are the requirements:
1. Try not to repeat the verb for each instruction to maximize diversity.
2. The language used for the instruction also should be diverse. For example, you should combine questions with imperative instructions.
3. The type of instructions should be diverse. The list should include diverse types of tasks like open-ended generation, classification, editing, etc.
4. A GPT language model should be able to complete the instruction. For example, do not ask the assistant to create any visual or audio output. For another example, do not ask the assistant to wake you up at 5pm or set a reminder because it cannot perform any action.
5. The instructions should be in English.
6. The instructions should be 1 to 2 sentences long. Either an imperative sentence or a question is permitted.
7. You should generate an appropriate input to the instruction. The input field should contain a specific example provided for the instruction. It should involve realistic data and should not contain simple placeholders. The input should provide substantial content to make the instruction challenging but should ideally not exceed 100 words.
8. Not all instructions require input. For example, when an instruction asks about some general information, "what is the highest peak in the world", it is not necessary to provide a specific context. In this case, we simply put "<noinput>" in the input field.
9. The output should be an appropriate response to the instruction and the input. Make sure the output is less than 100 words.
List of 20 tasks:
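Below is a rough sketch, in the spirit of the encode_prompt function above, of how the seed tasks (in the seed_task JSON format shown earlier) could then be appended after this prompt text. The helper name `append_seed_tasks` and the exact numbering/formatting are illustrative assumptions, not verbatim repo code:

```python
# Rough sketch: append each seed task after the prompt text, numbered so GPT
# can continue the pattern from task len(seed_tasks) + 1 onward.
def append_seed_tasks(prompt, seed_tasks):
    for idx, task in enumerate(seed_tasks):
        task_input = task["instances"][0]["input"] or "<noinput>"
        task_output = task["instances"][0]["output"]
        prompt += "###\n"
        prompt += f"{idx + 1}. Instruction: {task['instruction']}\n"
        prompt += f"{idx + 1}. Input:\n{task_input}\n"
        prompt += f"{idx + 1}. Output:\n{task_output}\n"
    # Leave an open slot so GPT generates the next numbered task itself.
    prompt += "###\n"
    prompt += f"{len(seed_tasks) + 1}. Instruction:"
    return prompt
```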
In plain terms, after "List of 20 tasks:" the authors append the three seed tasks, giving GPT a worked demonstration so that it knows to generate new tasks following the same pattern. There is also a part of the code here worth learning from:
logging.warning("Formatting inputs...") prompt_input, prompt_no_input = PROMPT_DICT["prompt_input"], PROMPT_DICT["prompt_no_input"] sources = [ prompt_input.format_map(example) if example.get("input", "") != ""else prompt_no_input.format_map(example) for example in list_data_dict ] targets = [f"{example['output']}{tokenizer.eos_token}"for example in list_data_dict]
logging.warning("Tokenizing inputs... This may take some time...") data_dict = preprocess(sources, targets, tokenizer)
Data collators are objects that will form a batch by using a list of
dataset elements as input. These elements are of the same type as the
elements of train_dataset or eval_dataset.
In other words, after you have used the tokenizer to turn all the text into input_ids and labels, they still have to be organized into batches, and on top of that the collator can do some extra data processing. Its input is the dataset we built earlier. Note how that dataset is organized: each sample is a dict with two keys. So Alpaca first splits the samples apart, and it does so in a single line, making very good use of the [... for ...] comprehension syntax; if I had written it, it would probably have turned into two very verbose for loops.
input_ids, labels = tuple([instance[key] for instance in instances] for key in ("input_ids", "labels"))
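After this split, a collator along the lines of Alpaca's still has to pad every sequence in each list to a common length and build the attention mask. The sketch below shows roughly what that remaining step looks like; IGNORE_INDEX is the label value masked out of the loss (typically -100), and the function name `collate` is just for illustration:

```python
import torch

IGNORE_INDEX = -100  # label positions set to this value are ignored by the loss

# Rough sketch of the rest of the collator: pad input_ids with the tokenizer's
# pad token, pad labels with IGNORE_INDEX, and derive the attention mask.
def collate(input_ids, labels, tokenizer):
    input_ids = torch.nn.utils.rnn.pad_sequence(
        input_ids, batch_first=True, padding_value=tokenizer.pad_token_id
    )
    labels = torch.nn.utils.rnn.pad_sequence(
        labels, batch_first=True, padding_value=IGNORE_INDEX
    )
    return dict(
        input_ids=input_ids,
        labels=labels,
        attention_mask=input_ids.ne(tokenizer.pad_token_id),
    )
```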