Hello,
Timwi wrote:
Can you explain in greater detail what your finite-state machine looks like? Is it like a directed graph?
Yes, it is a directed graph in which each transition from one node to another is labeled with one UTF-8 character. The graph is deterministic: a node cannot have two outgoing transitions with the same label. For each article, I added its name to the automaton, with the final node holding a pointer to the article's details (URL, name, number of links). Once all article names had been added, I computed for each node a table of best content, containing pointers to the articles with the highest number of links.
The pointers I mentioned are indexes into a vector that stores all the details about the articles.
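To make the description above more concrete, here is a minimal sketch in Python. It is not the actual implementation. The names Article, Node, build, fill_best and suggest are made up for illustration, and so is the parameter k (how many articles each node keeps). It also assumes that the best table of a node covers the articles reachable from that node, meaning those whose name starts with the prefix leading to it. Python strings are sequences of Unicode characters, so one transition here corresponds to one character rather than to its UTF-8 encoding.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Article:
    # Details stored once in a vector (a Python list here).
    name: str
    url: str
    nb_links: int


@dataclass
class Node:
    # Deterministic: a dict holds at most one transition per character.
    transitions: Dict[str, "Node"] = field(default_factory=dict)
    # Index into the articles list if an article name ends on this node.
    article: Optional[int] = None
    # Indexes of the most linked articles reachable from this node (assumption).
    best: List[int] = field(default_factory=list)


def fill_best(node: Node, articles: List[Article], k: int) -> None:
    """Compute the best table bottom-up: children first, then the node itself."""
    candidates = [] if node.article is None else [node.article]
    for child in node.transitions.values():
        fill_best(child, articles, k)
        candidates.extend(child.best)
    # Keep the k candidates with the highest number of links, best first.
    node.best = heapq.nlargest(k, candidates, key=lambda i: articles[i].nb_links)


def build(articles: List[Article], k: int = 3) -> Node:
    """Add every article name to the automaton, then fill the best tables."""
    root = Node()
    for index, art in enumerate(articles):
        node = root
        for ch in art.name:
            if ch not in node.transitions:
                node.transitions[ch] = Node()
            node = node.transitions[ch]
        node.article = index  # the final node points to the article details
    fill_best(root, articles, k)
    return root


def suggest(root: Node, prefix: str) -> List[int]:
    """Follow the transitions for prefix and return that node's best table."""
    node = root
    for ch in prefix:
        if ch not in node.transitions:
            return []  # no article name starts with this prefix
        node = node.transitions[ch]
    return node.best
```

With this layout a lookup only follows one transition per typed character and returns a table that was computed in advance, so its cost depends on the length of the prefix and not on the number of articles.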
You can find a small example (figure) in my FAQ: http://suggest.speedblue.org/FAQ.php
I can give you more details about the number of nodes and transitions in the automaton if you want. I hope this gives you a more precise idea of how it works.
Please don't hesitate to ask me for more details. Best Regards. Julien Lemoine