loading text from a db table into langchain - Stack Overflow

I am just taking my first steps with langchain. I have text in a SQLite table that I want to load and chunk. While the langchain docs say I can load a blob, I can't wrap my head around how to pass the text I select from the table to the text splitter. Here is my (wrong) code using better-sqlite3 and langchain:

import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000, 
    chunkOverlap: 200
});

// given CREATE TABLE t (id INTEGER PRIMARY KEY, fulltext TEXT);
const rows = db.prepare('SELECT id, fulltext FROM t').all();

for (const row of rows) {
   const docs = [ { metadata: row.id, pageContent: row.fulltext } ];
   const chunks = await splitter.splitDocuments(docs);
}

// Error
file:///Users/punkish/Projects/zai/node_modules/@langchain/textsplitters/dist/text_splitter.js:102
                const loc = _metadatas[i].loc && typeof _metadatas[i].loc === "object"
                                          ^

TypeError: Cannot read properties of undefined (reading 'loc')

asked Jan 16 at 11:39 by punkish, edited Jan 16 at 11:58

1 Answer

Found the answer. I had to use splitter.splitText(row.fulltext), which takes a plain string, instead of splitter.splitDocuments().
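
For anyone landing here later, this is roughly what the loop from the question looks like with splitText. It is only a sketch: chunkRows and the { rowId, index, text } shape are names made up for illustration, and it assumes `splitter` is the RecursiveCharacterTextSplitter from the question and `rows` is the result of db.prepare(...).all(). The row id is carried along by hand because splitText returns plain strings without metadata.

```javascript
// Sketch: split each row's text and keep the row id next to every chunk.
// `splitter` is assumed to be the RecursiveCharacterTextSplitter from the
// question; any object with an async splitText(text) method works here.
// chunkRows and the returned object shape are hypothetical, not langchain API.
async function chunkRows(rows, splitter) {
    const chunks = [];

    for (const row of rows) {
        // Skip NULL or empty fulltext values; there is nothing to split.
        if (typeof row.fulltext !== "string" || row.fulltext.length === 0) {
            continue;
        }

        // splitText takes a plain string and resolves to an array of strings.
        const pieces = await splitter.splitText(row.fulltext);

        // Record which row each chunk came from and its position in that row.
        pieces.forEach((text, index) => {
            chunks.push({ rowId: row.id, index, text });
        });
    }

    return chunks;
}
```

With the setup from the question this would be called as `const chunks = await chunkRows(rows, splitter);`. Collecting everything into one array, rather than reassigning `chunks` on each loop iteration as in the original code, keeps the results of earlier rows available after the loop ends.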
