Created: 2022-07-04
Tags: #fleeting
Abstract:
Theoretically, on input of size n, an algorithm with a running time of n is “asymptotically equivalent,” in terms of O, to an algorithm with a running time of 3n.
Indeed, when describing the running time of an algorithm, we typically focus on the most impactful term (i.e., n in this case, since n could be much larger than 2).
In the real world, though, the fact of the matter is that 2n feels twice as slow as n.
The challenge ahead of you is to
Files
dictionary.c from disk into memorydictionary.h. speller.c and dictionary.c can #include the file.Preprocessor Directive
#define LENGTH 45
defines a “constant” called LENGTH that has a value of 45.
Constant in the sense that you can’t change it in your own code.
clang will replace mentions of LENGTH in the code with, literally, 45.
bool check(const char *word);
unsigned int hash(const char *word);
bool load(const char *dictionary);
char * is what we used to call string.
So those three functions above are essentially just:
bool check(const string word);
unsigned int hash(const string word);
bool load(const string dictionary);
And const, makes it so that strings passed will not get changed
you won’t be able to change them accidentally or smtnh
$ ./speller text
will be equivalent to running
$ ./speller dictionaries/large text
text is the file you wish to spell-check. Suffice it to say, the former is easier to type! (Of course, speller will not be able to load any dictionaries until you implement load in dictionary.c! Until then, you’ll see Could not load.)
TIME IN load represents the number of seconds that speller spends executing your implementation of load.
TIME IN check represents the number of seconds that speller spends, in total, executing your implementation of check.
TIME IN size represents the number of seconds that speller spends executing your implementation of size.
TIME IN unload represents the number of seconds that speller spends executing your implementation of unload.
TIME IN TOTAL is the sum of those four measurements.
“misspelled” simply mean
dictionary provided.Alright, the challenge now before you is to
implement, in order,
load,
hash,
size,
check,
unload
as efficiently as possible using a hash table
in such a way that TIME IN load, TIME IN check, TIME IN size, and TIME IN unload are all minimized.
global pointer array, table
hash table = struct called node
Array contains N node pointers, N = 26 to match w/ default hash function.
You'll likely increase N depending on your own implementation of hash
We’ve implemented load, check, size, and unload, but only barely, just enough for the code to compile.
We’ve implemented hash with a sample algorithm based on the first letter of the word.
-> Your job, ultimately, is to re-implement those functions as cleverly as possible so that this spell checker works fast!
Do NOT ALTER
speller.cmakefileload, hash, size, check, or unload. Can Alter
N value in dictionary.c, so that your hash table can have more buckets.Can add functions
dictionary.c. (and, in fact, must in order to complete the implementations of load, hash, size, check, and unload)load
Hash Function
-> assigns a number to every point
Steps
-> Open dictionary file
fopen()
make sure its value NULL
-> Read strings from file one at a time
fscanf(file, "%s", word)
fscanf returns EOF once it reaches End Of File
-> Create a new node for each word
Store each of those words in the hash table
Use malloc to each word to store each node
Use this strcpy("n->word", "Hello") means Copy hello to n->word
-> Hash word to obtain a hash value (the index)
Function takes string and returns an Index
-> Insert node into hash table at that location
Set the new node to point to the first position in linked list
Then you can point the first position in linked list to point at the new node
check
hello in dictionary, then check returns true for all cases HELLO, hElLO, heLLO, heLlO foo is in dictionary, check should return false given foo's if foo's is not also in dictionary.check will only be passed words that contain (uppercase or lowercase) alphabetical characters and possibly apostrophes.true if word is in dictionarystrcasecmp is a function that compares strings but ignores casing)hash
First one letter is good, but it's not fast
First two letters
or using math to use all letters
With three letters
Some letters in 3rd position are missing (Meaning A B ?? D E F G)
Okay, so instead of getting each letter. We could use for loop to get the letters
When we use for loop, do we ignore the apostrophe????
AB =
BA =
I do not understand what the end result of my hash function will be
I have no idea how many buckets my hash function will get
size
unload
Dictionary File
\n)