• <xmp id="om0om">
  • Conversational AI / NLP

    ???? ?? RAG ????? ??

    Reading Time: 5 minutes

    ??? ???? AI ?? ?????? ???? ???? ?? ?? ??? ???? ???? ????? ?? ??? ??????. ???? ?? ?? ?? ????? ??? ?? ?? ??? ???? ??? ??? ? ? ??? ?????? ??? ??? ???? ?? ??????. ?? ?? ?? ???? ??? ?? ??? ?????? ??? ???? ??? ???? ??? ??? ?? ? ????.

    ?? ???? ?? ?? ??(RAG) ?????? ????? ? ??? ??? ??, ?? ?? ?? ?? ??(LLM)? ?? ???? ?? ???? ??? ??? ? ??? ?????. ??? ??? RAG ?????? ?? ????? ???? ? ? ?? ??? ??? ?? ??? ???? ??? ???? ???? ????? ???? ???? ? ? ?? ?????.

    ? ???? NVIDIA NeMo Retriever ??? NIM? ?????. ? ???? ???? ??? ?? ?? 16? ???? ???? Mistral-7B? ?? ??? LoRA ??? ????? ??????. ??? ??? ??? ??? ??? ?? ???? ????, ?? ?? ??? ?? ?? ?? ??? ?? ?????.

    ????? ??????

    ???? LLM? ?? ?? ?? ??? ???? ?? ??? ???? ??? ? ???? ??? ?????.

    ???? BM25 ?? ?? ??? ??? ?? ??? ?? ?? ??? ???? ?? ?? ?? ?? ??? ?????. ?? ?? ??? ?? ???? ??? ? ?? ?? ???? ???? ???? LLM? ?????. LLM? ??? ??? ???? ?? ???? ?? ??? ????? ?? ? ??? ??? ???????.

    ? ????? ??? ??? ??? ?? ??? ??? ??? ??? ?????? ?? ??? ??? ?? ??????. ???? ????? ?? ?? ?? ??? ?? ? ?? ??? ????, ?? ???? ?? ??? ????? ????? ???. ?? ?? ??? ??? ??? ??? ? ?? ?? ??? RAG ?????? ???? ?? ??? ?? ????? ????? ????? ? ?? ????.

    ??? ??? ?? ?? ???????? ??? NVIDIA NeMo Retriever ???? ?????? NVIDIA API ????? ?????.

    ???? ?? ??

    ? ????? ??? ????? ?? ???? ?? LLM ?? ?????? ?? ?? ??? ?????:

    ??

    ????? NVIDIA API ?????? ?? ??? ???? ?? ??? ????:

    ??? ?????.
    Python, API ? ??? ?????.
    ??? ?? NVIDIA_API_KEY? ?????.
    ?? ?????? ???? ? ????.
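Before running the later steps, it can help to confirm that `NVIDIA_API_KEY` is actually visible to the Python process. The sketch below is one way to do that. The helper name `find_nvidia_api_key` is hypothetical, and the `nvapi-` prefix check is an assumption about how keys from the NVIDIA API catalog are formatted.

```python
import os
from typing import Mapping, Optional


def find_nvidia_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key stored in NVIDIA_API_KEY, or raise if it is missing or malformed."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("NVIDIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError("NVIDIA_API_KEY is not set")
    # Assumption: keys issued by the NVIDIA API catalog start with "nvapi-".
    if not key.startswith("nvapi-"):
        raise RuntimeError("NVIDIA_API_KEY does not look like an NVIDIA API key")
    return key
```

Calling `find_nvidia_api_key()` with no arguments reads the current environment and fails early with a readable message, rather than surfacing later as an authentication error from the endpoint.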

    ?? LangChain, NVIDIA AI ?????, FAISS? ?????:

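    pip install langchain
    pip install langchain_nvidia_ai_endpoints
    pip install faiss-gpu
    # Also needed by the code below: the community loaders and vector stores,
    # the text splitters, and the PDF parser used by PyPDFLoader.
    pip install langchain_community langchain_text_splitters pypdf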

    ?? ?? ????

    ? ????? ???? LLM? ?? ?? NVIDIA ??? VILA: ?? ?? ??? ?? ?? ??? ?????. ? ???? ?? ???? ? ?? PDF? ?????, ??? ?? ???? ?? ??? ??? ? ????.

    from langchain_community.document_loaders import PyPDFLoader
     
    document = PyPDFLoader("2312.07533v4.pdf").load()

    ??? ??

    ???? ??? ??? ??? ?????.

    TextSplitter? chunk_size ????? ?????. RAG ?????? ?? ??? ??? ?? ???? ??? ??? ????? ?? ? ?? ?? ??? ??? ?? ??? ???? ?? RAG ??? ?? ?????. ?? ????? ????? ?? ??? ?? ?? ???? ?? ??? ?????.

    ?? ????(??? ??? ??? ??)? LLM? ???? ?? ??? ???. ?? ??? ?? ?? ???? ?? ?? ?? ??? ??? ????. ??? ?? ??? ??? ??, ???? ?? LLM? ?? 100~600????? ???.

    from langchain_text_splitters import RecursiveCharacterTextSplitter
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=200)
    texts = text_splitter.split_documents(document)
    ?? 1. ????? ??? ?? ???
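With `chunk_size=800` and `chunk_overlap=200`, consecutive chunks can share up to 200 characters, so a sentence cut at one chunk boundary still appears whole in a neighboring chunk. The sketch below illustrates that sliding-window idea only. It is not the `RecursiveCharacterTextSplitter` algorithm, which additionally tries to break on separators such as paragraph and line boundaries. The helper name `split_with_overlap` is hypothetical.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 800, chunk_overlap: int = 200) -> List[str]:
    """Cut text into fixed-size character windows that overlap their neighbors."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new chunk starts this many characters after the previous one.
    stride = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

A larger overlap preserves more context across boundaries, at the cost of more chunks to embed and store.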

    ??? ????

    ????, NVIDIA AI ????? ?????? ???? ???? ???? ??? ?? ??? ? ??? /embed ????? ???? ?? ???? ???? ?????.

    ? ???? ??? ??? ???? ??? ?? ? ?????? ?? ?????? FAISS? ?????. ???? ?? ??? ?? ??? ???? ????? ???? ???, RAM? ?? ?? ? ?? ???? ??? ? ????.

    from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
    from langchain_community.vectorstores import FAISS
     
    embeddings = NVIDIAEmbeddings()
    db = FAISS.from_documents(texts, embeddings)

    ?? retriever ???

    ?? ??? ???? ?? ???? ??? ??? ?? ???? ?? ??? ?????. ? ??? ??? ?? ????? ???? ??? ?? ???? ?? 45?? ??? ?????:

    retriever = db.as_retriever(search_kwargs={"k": 45})
     
    query = "Where is the A100 GPU used?"
    docs = retriever.invoke(query)
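Conceptually, the retriever embeds the query and returns the `k` stored chunks whose vectors lie closest to it. The sketch below shows that top-k selection with cosine similarity on plain Python lists. It is an illustration under stated assumptions: the function names are hypothetical, and the metric the FAISS index actually uses depends on how the store is configured.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          doc_vecs: Sequence[Sequence[float]],
          k: int) -> List[Tuple[int, float]]:
    """Return (index, similarity) pairs for the k most similar document vectors."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A generous `k` such as 45 favors recall: it gives a later reranking stage a wide candidate pool to choose from.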

    ??? ?? ??

    ?? NeMo Retriever ??? NIM? ???? ??? ??? ?????. ??? ??? ??? ??? ?? ? ?? ??? ???? ?? ?? ??? ???? ? ???? GPU ?? ?????. ??? ??? ???? ?? ???? ?? ??? ?? ??? ??? ??? ??? ?? ????.

    NIM? LangChain ?? ?? ???? ???? ????, ?? ????? ?? ??? ???? ???? ? ?????? ??? ?????.

    from langchain_nvidia_ai_endpoints import NVIDIARerank
    from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
     
    reranker = NVIDIARerank()
    compression_retriever = ContextualCompressionRetriever(
        base_compressor=reranker, base_retriever=retriever
    )
     
    reranked_chunks = compression_retriever.invoke(query)

    ??? NIM? ?? ???? ?? ??? ?? ?? ?? ???? ??? ??? ???? ????, ? ???? A100 GPU? ???? ????:

    Table 10. The SFT blend we used during the ablation study.
    B. Training Cost
    We perform training on 16 A100 GPU nodes, each node
    has 8 GPUs. The training hours for each stage of the 7B
    model are: projector initialization: 4 hours; visual language
    pre-training: 30 hours; visual instruction-tuning: 6 hours.
    The training corresponds to a total of 5.1k GPU hours. Most
    of the computation is spent on the pre-training stage.
    We have not performed training throughput optimizations
    like sample packing [ 32] or sample length clustering. We
    believe we can reduce at least 30% of the training time with
    proper optimization. We also notice that the training time is
    much longer as we used a high image resolution of 336 ×336
    (corresponding to 576 tokens/image). We should be able to
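The two-stage pattern used here, a broad first-pass retrieval followed by a rescoring step that keeps only the best few chunks, can be sketched without any model. In the sketch below, `term_overlap` is a deliberately simple stand-in for the reranking model and says nothing about how the NIM scores relevance. Both function names are hypothetical.

```python
from typing import Callable, List, Tuple


def rerank(query: str,
           candidates: List[str],
           score: Callable[[str, str], float],
           top_n: int) -> List[Tuple[str, float]]:
    """Score every candidate against the query and keep the top_n highest."""
    scored = [(doc, score(query, doc)) for doc in candidates]
    # Sort by score, best first; the sort is stable, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def term_overlap(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query terms that also occur in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(doc.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)
```

For example, `rerank(query, candidates, term_overlap, top_n=5)` reorders a candidate list by the toy score. Swapping in a stronger scoring function changes the ranking quality but not the surrounding control flow.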

    ?? ??? ??? ?? ??

    ?? ??? ??? ?? ???? ????? ? ???, ???? ???? RAG ??????? ?? ??? ??? ??? ? ????.

    ?? ?? ?? ??? ???? BM25 ???? ???? ??? ?????? ??? ?????. ? ???? ????? ???? ?? ???? ???? ??? ???? ??? ?????. ??? ???? ???? ???? ?? ?? ?? ???? ??? ?????.

    ?? ?? ??? ?? ??? ?? ??? BM25 ??? ??? ????. combined_docs? ??? ??? NIM? ?? ???? ???? ?? ?????.

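    # docs comes from the FAISS retriever above; bm25_docs is assumed to hold
    # the chunks returned by a separately configured BM25 retriever.
    all_docs = docs + bm25_docs
     
    # Keep only the five highest-ranked chunks.
    reranker.top_n = 5
     
    combined_docs = reranker.compress_documents(query=query, documents=all_docs)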

    BM25 ??? ??? ??? ??? ??? /langchain-ai/langchain-nvidia GitHub ?????? ?? ???? ?????.
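For readers who want to see what the keyword-based source contributes, the sketch below scores documents with a common Okapi BM25 formulation and merges result lists from several retrievers. It is an illustration under stated assumptions, not the code in the linked repository: the whitespace tokenizer, the defaults `k1=1.5` and `b=0.75`, and both function names are hypothetical choices. It works on plain strings rather than LangChain `Document` objects.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Return one Okapi BM25 score per document for a whitespace-tokenized query."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq: Counter = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores: List[float] = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed inverse document frequency; never negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        scores.append(score)
    return scores


def merge_unique(*result_lists: List[str]) -> List[str]:
    """Concatenate result lists in order, dropping repeated chunks."""
    seen = set()
    merged: List[str] = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged
```

Removing duplicates before reranking is an optional refinement. The simple concatenation shown above also works, at the cost of scoring a chunk twice when both retrievers return it.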

    RAG ?????? ??

    ???? ????? ???? ? ??? RAG ?????? ???? ?? ??? ???? ? ?? ???? ?? ??? ????? ???? ??? ?? ???? ? ????.

    ?? 2. ???? ?? ??? RAG ????? ????

    ? ?? ?? ??? compression_retriever ??? RAG ?????? ?????.

    from langchain.chains import RetrievalQA
    from langchain_nvidia_ai_endpoints import ChatNVIDIA
     
    chain = RetrievalQA.from_chain_type(
        llm=ChatNVIDIA(temperature=0), retriever=compression_retriever
    )
    result = chain.invoke({"query": query})
    print(result.get("result"))

    ?? RAG ?????? ??? ??? ??? ???? ?? ????? ?????:

    The A100 GPU is used for training the 7B model in the supervised 
    fine-tuning/instruction tuning ablation study. The training is 
    performed on 16 A100 GPU nodes, with each node having 8 GPUs. The 
    training hours for each stage of the 7B model are: projector 
    initialization: 4 hours; visual language pre-training: 30 hours; 
    and visual instruction-tuning: 6 hours. The total training time 
    corresponds to 5.1k GPU hours, with most of the computation being 
    spent on the pre-training stage. The training time could potentially 
    be reduced by at least 30% with proper optimization. The high image 
    resolution of 336 ×336 used in the training corresponds to 576 
    tokens/image.
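`RetrievalQA.from_chain_type` defaults to the "stuff" chain type, which places all retrieved chunks into a single prompt alongside the question. The sketch below shows the shape of that step. The prompt wording and the helper name `build_stuffed_prompt` are hypothetical and are not LangChain's actual template.

```python
from typing import List


def build_stuffed_prompt(question: str, chunks: List[str]) -> str:
    """Join the retrieved chunks into one context block and append the question."""
    # Blank lines between chunks keep their boundaries visible to the model.
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Because every chunk lands in the same prompt, the number of chunks the reranker keeps directly bounds the prompt length. That is one reason to trim the candidate list before generation.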

    ??

    RAG? LLM? ??? ?? ??? ??? ??? ??? ?? ???? ??????. ??? ?? ??? ???? RAG ??? ????? ??? ? ?? ??? ?? ??? ???? ?? ??? ????? ?? ??? ?????? ??????? ?????.

    LLM? ?? ???? ?? RAG? ??? ???? ??? ??? ??? ???? ??? ? ?? ???? ??? ???? ???? ? ?? ? ??? ??? ? ?? ?????.

    ?? RAG ?????? ??? ?? ?? ???? ?? ??? ????? ??? ???? ??? ?? LLM? ???? ?? ??? ??? ??? ???? ???? ?? ?????. ??? ?? ?? LLM? ??? ??? ??? ?? ????. RAG ??? ????? ?? ??? ????? ??? ???? ??? ??? ?????.

    ?? ?? ? ??? ?? ??? ??? NVIDIA AI LangChain ?????? ?????.
