Free Republic

To: general_re
"You have added keywords A, B, and C. Previous articles with keywords A, B, and C often include keyword D. Would you like to add keyword D to this article?" Something like that?

Yeah, something like that. How much overhead would it require? I have no idea. I'm not a programmer. (Which, as you note, makes it easy to think up stuff for others to code!) I would guess, though, that you'd have your keywords in a separate database to which the keyword field in the database of articles (or article headers?) would have a many-to-many relationship, since one article can carry several keywords and one keyword can appear on many articles. Then I suppose you could have some kind of optimized index describing the relationship of keywords to each other (maybe only tracking some subset of the more commonly used keywords). You'd then update this index a couple of times a day, at off-peak hours.
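A rough sketch of what such a keyword-to-keyword index could look like, rebuilt in one batch pass. This is only an illustration in Python: the function name `build_cooccurrence`, the `min_articles` cutoff for "commonly used", and the sample articles are all made up here, and nothing is known about how the site actually stores keywords.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(article_keywords, min_articles=2):
    """Map each commonly used keyword to a Counter of keywords seen alongside it.

    article_keywords: dict of article id -> list of keywords (hypothetical layout).
    min_articles: a keyword is tracked only if at least this many articles use it.
    """
    # Count how many articles use each keyword.
    usage = Counter(kw for kws in article_keywords.values() for kw in set(kws))
    common = {kw for kw, n in usage.items() if n >= min_articles}

    index = defaultdict(Counter)
    for kws in article_keywords.values():
        tracked = sorted(set(kws) & common)
        # Record every pair of tracked keywords that share this article.
        for a, b in combinations(tracked, 2):
            index[a][b] += 1
            index[b][a] += 1
    return index


# Hypothetical sample data for illustration only.
articles = {
    1: ["California", "budget", "deficit"],
    2: ["California", "budget", "Schwarzenegger", "deficit"],
    3: ["California", "recall"],
}
index = build_cooccurrence(articles)
```

Rerunning this over all articles a couple of times a day would be the batch update; keywords used by only one article never enter the index, which keeps it small.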

'Course, for all I know the keywords might just be a variable-sized text field within the article record, which would make all this more difficult, I presume.

239 posted on 11/25/2003 11:48:40 AM PST by Stultis
[ Post Reply | Private Reply | To 130 | View Replies ]


To: Stultis
It shouldn't require too much more overhead than a regular user-initiated search, depending on how the data is organized. Suppose I'm searching for articles by keyword, and I want all the articles with the keywords "California", "Schwarzenegger", and "budget", so I do a Boolean "AND" search to connect those three keywords. And the system returns all articles that match all three keywords.

Now suppose I post an article and add those three keywords. In order to make suggestions, the system will do that same Boolean "AND" search to find all the articles with those three keywords, same as a user-initiated search. Then it has one extra step: comparing the keyword lists of the articles the search returns, to see if any other keyword hits some predetermined target. For example, if 75% of the articles returned by searching on the first three keywords also have the keyword "deficit", then it could suggest "deficit" as an additional keyword for the current article. Or something like that.
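To make that concrete, here is a minimal sketch of the two steps in Python. The function name `suggest_keywords`, the in-memory dict of articles, and the sample data are assumptions for illustration; a real system would run the "AND" search against its own index rather than scanning every article.

```python
from collections import Counter


def suggest_keywords(articles, chosen, threshold=0.75):
    """Suggest extra keywords that appear on enough of the matching articles.

    articles: dict of article id -> list of keywords (hypothetical layout).
    chosen: keywords the poster has already added.
    threshold: fraction of matching articles that must carry a keyword.
    """
    chosen = set(chosen)
    # Step 1: the Boolean "AND" search, i.e. articles carrying every chosen keyword.
    matches = [set(kws) for kws in articles.values() if chosen <= set(kws)]
    if not matches:
        return []
    # Step 2: count the other keywords on those articles and apply the target.
    counts = Counter(kw for kws in matches for kw in kws - chosen)
    return sorted(kw for kw, n in counts.items() if n / len(matches) >= threshold)


# Hypothetical sample data for illustration only.
articles = {
    1: ["California", "Schwarzenegger", "budget", "deficit"],
    2: ["California", "Schwarzenegger", "budget", "deficit"],
    3: ["California", "Schwarzenegger", "budget", "deficit", "taxes"],
    4: ["California", "Schwarzenegger", "budget"],
    5: ["California", "recall"],
}
suggestions = suggest_keywords(articles, ["California", "Schwarzenegger", "budget"])
```

With this sample data, four articles match all three keywords and three of them also carry "deficit", so it meets the 75% target; "taxes" appears on only one of the four and is not suggested.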

I'm not sure there would be any advantage to not writing index changes on-the-fly. You're going to have to write them somewhere, even if you just want to accumulate the changes and rewrite the index later. That makes index adds a two-step process (write the changes somewhere temporary, then write them again when you merge them into the index) instead of the single step of merging on-the-fly. Presumably the server is already capable of deferred writes when it's under load anyway, so I don't think formalizing that system will really gain you much, but it might be worth examining for a bit just in case...
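A toy comparison of the two approaches, sketched in Python. The class `KeywordIndex` and its method names are invented for illustration and say nothing about how the actual server works; the point is only that the deferred path stores each change twice.

```python
from collections import defaultdict


class KeywordIndex:
    """Toy keyword index: maps each keyword to the set of article ids using it."""

    def __init__(self):
        self.postings = defaultdict(set)  # keyword -> article ids
        self.pending = []  # staged (article_id, keywords) awaiting a flush

    def add_now(self, article_id, keywords):
        # Single step: merge straight into the index.
        for kw in keywords:
            self.postings[kw].add(article_id)

    def add_deferred(self, article_id, keywords):
        # Step one of two: write the change somewhere temporary.
        self.pending.append((article_id, list(keywords)))

    def flush(self):
        # Step two of two: replay the staged changes into the index.
        for article_id, keywords in self.pending:
            self.add_now(article_id, keywords)
        self.pending.clear()

    def search_all(self, keywords):
        # Boolean "AND": ids of articles carrying every given keyword.
        sets = [self.postings.get(kw, set()) for kw in keywords]
        return set.intersection(*sets) if sets else set()


idx = KeywordIndex()
idx.add_now(1, ["California", "budget"])
idx.add_deferred(2, ["California", "budget", "deficit"])
before_flush = idx.search_all(["California", "budget"])
idx.flush()
after_flush = idx.search_all(["California", "budget"])
```

Until `flush` runs, the deferred article is invisible to searches, so suggestions based on the index would lag behind by however long the changes sit in the staging list.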

273 posted on 11/25/2003 12:37:11 PM PST by general_re (Take away the elements in order of apparent non-importance.)
[ Post Reply | Private Reply | To 239 | View Replies ]



FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson