```go
package main

import "fmt"

func main() {
	var numerator int = 11
	var denominator int = 2
	var result, remainder int = intDivision(numerator, denominator)
	fmt.Println(result, remainder)
}

func intDivision(num1, num2 int) (int, int) {
	var result int = num1 / num2
	var remainder int = num1 % num2
	return result, remainder
}
```
```go
func sliceAppend() {
	var slice []int32 = []int32{1, 2, 3}
	fmt.Printf("length is %v, with capacity is %v\n", len(slice), cap(slice)) // length is 3, with capacity is 3
	slice = append(slice, 4)
	fmt.Printf("length is %v, with capacity is %v\n", len(slice), cap(slice)) // length is 4, with capacity is 6
	fmt.Println(slice)
}

// Creating a slice dynamically
func sliceDynamic() {
	// make lets you set both length and capacity up front; here we build a
	// slice with length 3 and capacity 20.
	var intSlice []int32 = make([]int32, 3, 20)
	fmt.Printf("length is %v, with capacity is %v\n", len(intSlice), cap(intSlice))
	// The first three elements are initialized (zero-valued); the rest are not.
	for i := 0; i < len(intSlice); i++ {
		fmt.Println(intSlice[i])
	}
	// As with ArrayList, if the final length of the dynamic array can be
	// estimated, set it at construction time; otherwise frequent growth
	// (reallocation) hurts performance.
}

// Slicing semantics: a half-open interval [low, high), like substring in Java.
func slicePartition() {
	sli := []int{1, 2, 3, 4, 5, 6}
	fmt.Printf("len=%d cap=%d slice=%v\n", len(sli), cap(sli), sli)
	sub := sli[1:4] // indices 1, 2, 3 -> [2 3 4]; cap runs to the end of the backing array
	fmt.Printf("len=%d cap=%d slice=%v\n", len(sub), cap(sub), sub)
}
```
Young-only phase: This phase starts with a few Normal young collections that promote objects into the old generation. The transition between the young-only phase and the space-reclamation phase starts when the old generation occupancy reaches a certain threshold, the Initiating Heap Occupancy threshold. At this time, G1 schedules a Concurrent Start young collection instead of a Normal young collection.
Concurrent Start: This type of collection starts the marking process in addition to performing a Normal young collection. Concurrent marking determines all currently reachable (live) objects in the old generation regions to be kept for the following space-reclamation phase. While the marking hasn’t completely finished, Normal young collections may occur. Marking finishes with two special stop-the-world pauses: Remark and Cleanup.
Remark: This pause finalizes the marking itself, performs global reference processing and class unloading, reclaims completely empty regions and cleans up internal data structures. Between Remark and Cleanup G1 calculates information to later be able to reclaim free space in selected old generation regions concurrently, which will be finalized in the Cleanup pause.
Cleanup: This pause determines whether a space-reclamation phase will actually follow. If a space-reclamation phase follows, the young-only phase completes with a single Prepare Mixed young collection.
Space-reclamation phase: This phase consists of multiple Mixed collections that in addition to young generation regions, also evacuate live objects of sets of old generation regions. The space-reclamation phase ends when G1 determines that evacuating more old generation regions wouldn’t yield enough free space worth the effort.
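The phase transitions described above are driven by a few tunable JVM flags. A hedged sketch of a typical G1 command line (the values shown are HotSpot's defaults, included only for illustration; `app.jar` is a placeholder):

```shell
# -XX:InitiatingHeapOccupancyPercent  old-gen occupancy that triggers a Concurrent Start collection (default 45)
# -XX:MaxGCPauseMillis                pause-time goal G1 uses when sizing collections (default 200)
# -Xlog:gc                            logs each collection type (Normal, Concurrent Start, Mixed, Remark, Cleanup)
java -XX:+UseG1GC \
     -XX:InitiatingHeapOccupancyPercent=45 \
     -XX:MaxGCPauseMillis=200 \
     -Xlog:gc \
     -jar app.jar
```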
Shallow heap is the memory consumed by one object. An object needs 32 or 64 bits (depending on the OS architecture) per reference, 4 bytes per Integer, 8 bytes per Long, etc. Depending on the heap dump format the size may be adjusted (e.g. aligned to 8, etc…) to model better the real consumption of the VM.
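As a back-of-the-envelope illustration of that per-object arithmetic, here is a sketch assuming a 64-bit HotSpot VM with compressed oops (12-byte object header, 4-byte references, 8-byte alignment); these layout numbers are assumptions for the example, not something the text above specifies:

```python
def shallow_size(num_refs: int, primitive_bytes: int,
                 header: int = 12, ref: int = 4, align: int = 8) -> int:
    """Rough shallow size of one object: header + fields, rounded up to alignment."""
    raw = header + num_refs * ref + primitive_bytes
    return (raw + align - 1) // align * align

# An object with two reference fields and one long (8-byte) field:
print(shallow_size(2, 8))   # 12 + 2*4 + 8 = 28, aligned up to 32
```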
Retained set of X is the set of objects which would be removed by GC when X is garbage collected.
Retained heap of X is the sum of shallow sizes of all objects in the retained set of X, i.e. memory kept alive by X.
The next level of the tree lists those objects that would be garbage collected if all incoming references to the parent node were removed.
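A minimal sketch of the retained-set definition on a toy object graph (this illustrates the definition only, not MAT's actual dominator-tree algorithm; `reachable` and `retained_set` are illustrative names):

```python
def reachable(graph, roots):
    """All nodes reachable from the roots in a {node: [referenced, ...]} graph."""
    seen, stack = set(), list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return seen

def retained_set(graph, roots, x):
    """Objects that become unreachable once x (and every edge to it) is removed."""
    full = reachable(graph, roots)
    pruned = {n: [r for r in refs if r != x]
              for n, refs in graph.items() if n != x}
    return full - reachable(pruned, [r for r in roots if r != x])

# root -> X -> {C, D};  root -> B -> D
graph = {"root": ["X", "B"], "X": ["C", "D"], "B": ["D"], "C": [], "D": []}
# D survives X's collection (B still holds it), so D is NOT in X's retained set:
print(sorted(retained_set(graph, ["root"], "X")))   # ['C', 'X']
```

The retained heap of X would then be the sum of the shallow sizes of the objects in this set.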
Figure 12: Dominator Tree
Taking the figure above as an example, the largest consumer of heap memory is the TaskThread for the http-no-8080-exec-2 thread: its own Shallow Heap is only 120 bytes, but its Retained Heap is 2,417,669,960 bytes, 94.90% of the whole heap. In the figure, AspectJExpressionPointcut is expanded, meaning that once AspectJExpressionPointcut is garbage collected, every object in the expanded list would be reclaimed as well; that list is its retained set.
When reading the stack trace, the usual approach is to scan top-down for the first frame of business code and analyze from there. Figure 26 shows that the business code at ChatManagerImpl.java:300 adds an element to a list, which triggers a container resize and ultimately causes the OutOfMemoryError. Moreover, while executing copyOf this thread holds a very large Max Local Retained Heap (the retained size of its local variables). With the business code located, the next step is to examine that code for the root cause.
```python
# LangChain example: query a SearxNG instance and return the top results.
from langchain_community.utilities import SearxSearchWrapper

s = SearxSearchWrapper(searx_host="http://localhost:8180")
s.run("what is a large language model?")
```
```python
import asyncio
from typing import List, Optional, Tuple

import httpx

headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36'}

async def get_result(url: str):
    if not is_direct(url):  # not directly reachable: route through the proxy
        async with httpx.AsyncClient(proxy='socks5://127.0.0.1:1080') as client:
            try:
                response = await client.get(url, headers=headers, timeout=10.0)  # set a timeout
                return url, response
            except httpx.RequestError:
                return url, None
    else:
        async with httpx.AsyncClient() as client:
            try:
                response = await client.get(url, headers=headers, timeout=10.0)
                return url, response
            except httpx.RequestError:
                return url, None

async def get_results(urls: List[str]):
    tasks = [get_result(url) for url in urls]
    results = await asyncio.gather(*tasks)
    for url, response in results:
        if response is None:
            print(f"URL: {url} - Failed to connect")
        # else:
        #     print(url, response.text[:100])
    return results
```
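The error-handling pattern here, where each coroutine catches its own exception and returns `(url, None)` instead of raising so that one failure doesn't abort the whole `gather`, can be exercised offline with a stand-in coroutine (`fake_get` and `fetch_all` are illustrative names, not part of the code above):

```python
import asyncio

async def fake_get(url):
    # Stand-in for get_result(): returns (url, payload), or (url, None) on failure.
    await asyncio.sleep(0)   # yield to the event loop, as a real request would
    if "bad" in url:
        return url, None     # a request error was swallowed inside the coroutine
    return url, f"content of {url}"

async def fetch_all(urls):
    # gather() preserves input order, even though tasks may finish in any order
    return await asyncio.gather(*(fake_get(u) for u in urls))

results = asyncio.run(fetch_all(["http://a", "http://bad", "http://b"]))
print(results[1])   # ('http://bad', None)
```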
For multilingual, utilize BAAI/bge-reranker-v2-m3 and BAAI/bge-reranker-v2-gemma. For Chinese or English, utilize BAAI/bge-reranker-v2-m3 and BAAI/bge-reranker-v2-minicpm-layerwise. For efficiency, utilize BAAI/bge-reranker-v2-m3 and the low layers of BAAI/bge-reranker-v2-minicpm-layerwise. For better performance, BAAI/bge-reranker-v2-minicpm-layerwise and BAAI/bge-reranker-v2-gemma are recommended.
```python
@app.post("/v1/chat/completions", response_model=ChatCompletionResponse)
async def create_chat_completion(request: ChatCompletionRequest):
    global model, tokenizer

    if len(request.messages) < 1 or request.messages[-1].role == "assistant":
        raise HTTPException(status_code=400, detail="Invalid request")

    gen_params = dict(
        messages=request.messages,
        temperature=request.temperature,
        top_p=request.top_p,
        max_tokens=request.max_tokens or 1024,
        echo=False,
        stream=request.stream,
        repetition_penalty=request.repetition_penalty,
        tools=request.tools,
    )
    logger.debug(f"==== request ====\n{gen_params}")

    for each_message in request.messages:
        info = str(each_message.role) + ":" + str(len(each_message.content))
        logger.debug(f"==== ===== ===")
        logger.debug(f"==== message len ====\n{info}")
        logger.debug(f"==== ===== ===")

    # Here is the handling of stream = False
    response = generate_llama3(model, tokenizer, gen_params)

    # Remove the first newline character
    if response["text"].startswith("\n"):
        response["text"] = response["text"][1:]
    response["text"] = response["text"].strip()

    return ChatCompletionResponse(
        model=request.model,
        id="",  # for open-source models, id is empty
        choices=[choice_data],
        object="chat.completion",
        usage=usage,
    )
```
```python
# Count tokens
import tiktoken

def num_tokens_from_string(string: str) -> int:
    """Return the number of tokens in a text string, using the cl100k_base tokenizer."""
    encoding = tiktoken.get_encoding('cl100k_base')
    num_tokens = len(encoding.encode(string))
    return num_tokens
```
```python
# Response format of the embedding endpoint
response = {
    "data": [
        {"object": "embedding", "embedding": embedding, "index": index}
        for index, embedding in enumerate(embeddings)
    ],
    "model": request.model,
    "object": "list",
    "usage": CompletionUsage(
        prompt_tokens=sum(len(text.split()) for text in request.input),
        completion_tokens=0,
        total_tokens=sum(num_tokens_from_string(text) for text in request.input),
    ),
}
return response
```
The following text is taken from the official website of a company named [bi-ehealthcare]. Based on the content provided, generate 6-8 question-and-answer pairs about the company. The questions should cover company information: the company's products, industry, services, address, and contact details (telephone, email, etc.), along with any other factual points mentioned in the content, but are not limited to these. Provide quality responses and avoid generic answers such as "Unfortunately, the text does not provide...". Ignore privacy-policy and cookie boilerplate, and ensure the total length of the reply is at least 500 characters. In particular, organize and translate your answers into English, as shown below: Q1: XXXX, A1: XXXX, Q2: XXXX, A2: XXXX