I am just taking my first steps with LangChain. I have text in a SQLite table that I want to load and chunk. The LangChain docs say I can load a blob, but I can't wrap my head around how to pass the text I select from the table to the text splitter. Here is my (wrong) code using better-sqlite3 and LangChain:
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,
    chunkOverlap: 200
});

// given CREATE TABLE t (id INTEGER PRIMARY KEY, fulltext TEXT);
const rows = db.prepare('SELECT id, fulltext FROM t').all();

for (const row of rows) {
    const docs = [ { metadata: row.id, pageContent: row.fulltext } ];
    const chunks = await splitter.splitDocuments(docs);
}
// Error
file:///Users/punkish/Projects/zai/node_modules/@langchain/textsplitters/dist/text_splitter.js:102
const loc = _metadatas[i].loc && typeof _metadatas[i].loc === "object"
^
TypeError: Cannot read properties of undefined (reading 'loc')
Found the answer. I had to use splitter.splitText(fulltext) instead of splitter.splitDocuments().
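For anyone landing here later, this is roughly how the fix could look inside the loop. It is only a sketch: chunkRows and fakeSplitter are hypothetical names made up for illustration, and the helper assumes nothing about the splitter beyond an async splitText(text) method that resolves to an array of strings, which is what the RecursiveCharacterTextSplitter from the question provides. Skipping rows with NULL or empty text is also an assumption about the data, not something the library requires.

```javascript
// Hypothetical helper: split each row's text and keep the row id with every chunk.
// `splitter` is any object with an async splitText(text) method, for example the
// RecursiveCharacterTextSplitter instance created in the question.
async function chunkRows(rows, splitter) {
  const out = [];
  for (const row of rows) {
    // Assumption: rows whose text column is NULL or empty are simply skipped.
    if (typeof row.fulltext !== "string" || row.fulltext.length === 0) continue;

    // splitText takes a plain string, so no document wrapper or metadata is needed.
    const pieces = await splitter.splitText(row.fulltext);

    // Carry the row id along by hand, since splitText returns bare strings.
    pieces.forEach((text, index) => out.push({ id: row.id, index, text }));
  }
  return out;
}

// Stand-in splitter for illustration only: fixed-size slices, no overlap.
// In real code, pass the LangChain splitter instead.
const fakeSplitter = {
  async splitText(text) {
    const size = 10;
    const parts = [];
    for (let i = 0; i < text.length; i += size) {
      parts.push(text.slice(i, i + size));
    }
    return parts;
  },
};
```

With the rows and splitter from the question, the call would be something like const chunks = await chunkRows(rows, splitter). Each entry then holds the originating row id, the chunk's position within that row, and the chunk text.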