java You are building a system that will access a collection

java

You are building a system that will access a collection of data for some scientific application and you need to select an appropriate data structure. Here are the details. The data entries are always long strings of characters, mid they always have different lengths (you will never be given two strings of the same length). You need to be able to insert new entries efficiently, and also you need to be able to retrieve (and delete from the data structure) the entry (string) that is the shortest among all in the data structure currently. Explain briefly what data structure you would use, how you would use it for each required operation mid what is the worst ease running time of each operation. Let T be a binary tree with n nodes. Define a Roman node to be a node v in T, such that the number of nodes in v\'s left subtree differ from the number of nodes in v\'s right subtree by at most 5. Describe a linear-time method for finding each node v of T, such that v is not a Roman node, but all of v\'s descendants are Roman nodes.

Trie data structure should be used.

Trie is an efficient information retrieval data structure. Using trie, search complexities can be brought to optimal limit (key length). If we store keys in binary search tree, a well balanced BST will need time proportional to M * log N, where M is maximum string length and N is number of keys in tree. Using trie, we can search the key in O(M) time. However the penalty is on trie storage requirements.

Every node of trie consists of multiple branches. Each branch represents a possible character of keys. We need to mark the last node of every key as leaf node. A trie node field value will be used to distinguish the node as leaf node (there are other uses of the value field). A simple structure to represent nodes of English alphabet can be as following,

class TrieNode
{
boolean isLeafNode;
int value;

TrieNode[] children;

TrieNode(boolean isLeafNode, int value)
{
this.value = value;
this.isLeafNode = isLeafNode;
children = new TrieNode[ALPHABET_SIZE];
}
}

Inserting a key into trie is simple approach. Every character of input key is inserted as an individual trie node. Note that the children is an array of pointers to next level trie nodes. The key character acts as an index into the array children. If the input key is new or an extension of existing key, we need to construct non-existing nodes of the key, and mark leaf node. If the input key is prefix of existing key in trie, we simply mark the last node of key as leaf. The key length determines trie depth.

Searching for a key is similar to insert operation, however we only compare the characters and move down. The search can terminate due to end of string or lack of key in trie. In the former case, if the value field of last node is non-zero then the key exists in trie. In the second case, the search terminates without examining all the characters of key, since the key is not present in trie.

The following picture explains construction of trie using keys given in the example below,

root
/ \\ \\
t a b
| | |
h n y
| | \\ |
e s y e
/ | |
i r w
| | |
r e e
|
r

In the picture, every character is of type trie_node_t. For example, the root is of type trie_node_t, and it’s children a, b and t are filled, all other nodes of root will be NULL. Similarly, “a” at the next level is having only one child (“n”), all other children are NULL. The leaf nodes are in blue.

Insert and search costs O(key_length), however the memory requirements of trie is O(ALPHABET_SIZE * key_length * N) where N is number of keys in trie. There are efficient representation of trie nodes (e.g. compressed trie, ternary search tree, etc.) to minimize memory requirements of trie.

how to retrieve something from a trie (java code)

public LowercaseTrieVocabulary getNode(String s) {
   LowercaseTrieVocabulary node = this;
   for (int i = 0; i < s.length(); i++) {
       int index = LOWERCASE.getIndex(s.charAt(i));
       LowercaseTrieVocabulary child = node.children[index];
       if (child == null) {
           // There is no such word
           return null;
       }
       node = child;
   }
   return node;
}

explanation :- The LOWERCASE.getIndex(s.charAt(i)) method simply returns the position of the ith character in the alphabet. On the returned node, a Boolean property node indicates that the node corresponds to the last letter of a word, i.e. a letter marked in red in the previous example. Since each node keeps a counter of the number of children, if this counter is positive then there are longer words in the dictionary that have the current string as a prefix. Note: the node does not really need to keep a reference to the character that it corresponds to, because it’s implicit in its position in the trie.

inserting a value in trie code in java(algorithm)

Insertion:
Let\'s say we want to insert a key-value pair (\"abc\",1) into the trie.
1. We go to root node.
2. Get the index of the first character (\'a\') of \"abc\". That would be 0 in our alphabet system.
3. Go to 0th child of root. Because 0th child is null we first construct a TrieNode to which this 0th child would point. This newly constructed node would be node \'a\'. We mark this node as current node.
4. Now we get the index of the second character of \"abc\". That would be 1 and therefore we go to 1st child of current node(from step #3).
5. Here again, 1st child is null. We create a new TrieNode which would be node \'b\'. We mark this node as current node.
6. Now we get the index of the third character of \"abc\". That would be 2 and therefore we go to 2nd child current node(from step#5).
7. Here again, 2nd child is null. We create a new TrieNode which would be node \'c\'. We mark this node as current node.
8. At this step, we are done reading all the characters of the given key. Hence we mark the current node that is node \'c\' as leaf node and store value 1 at this leaf node.

Now at this point let\'s say we want to insert a key-value pair (\"abb\",9) into the trie.
1. We go to root node.
2. Get the index of the first character of \"abb\". That would be 0 in our alphabet system.
3. Go to 0th child of root. Now as you can notice, 0th child won\'t be null since we have constructed node \'a\' in the previous insertion sequence. We mark this node \'a\' as the current node.
4. Now we get the index of the second character(\'b\') of \"abb\". That would be 1, we go to 1st child of current node(from step #3).
5. 1st child of current node which is node \'b\' is not null. We mark this node \'b\' as current node. We now get the index of the last character (\'b\') of \"abb\". That index would be 1 and hence we go to 1st child of current node which is null. We create a new TrieNode which would be node \'b\'. Current node now points to this newly created node.
6. We are done reading all characters if given key \"abb\". We mark current node as leaf node and store value 9 in it.

Hopefully, these steps will help to understand the insertion algorithm better.

Search algorithm steps:
Example-1 : searching for non-existing key \"ac\"
1. Go to root node.
2. Pick the first character of key \"ac\" which would be \'a\'. Find out its index using alphabet system in use.
3. Index returned would be 0, go to 0th child of root which is node \'a\'. Mark this node as current node.
4. Pick the second character of key \"ac\" which would be \'c\'. Its index would be 2 and therefore we go to 2nd child of current node.
5. Now at this point, we find out that 2nd child of current node is null. If you notice, from insertion algorithm of a given key, no node in the key-path from the root could be null. If it is null then that implies that this key was never inserted in the trie. And therefore in such cases, we return \'KEY_NOT_FOUND\'.

Example-2 : searching for an existing key \"abb\"
The steps are very similar to example-1. We keep on reading characters of given key and according to indices of characters, travel from root node to node \'b\' which is at level-3 (if root is at level-0). At this point we would have read all the characters of the key \"abb\" and hence we return the value stored at this node.

The basic logic is to set isRoman field of a node to false even when a node\'s descendant (suppose grand grand child) is not roman. This means that even if a node satisfies the property (left and right descendants are at most 5) but if its left and right subtree\'s isRoman is false, then current node\'s isRoman will also be false.

below is the code for the problem

/*
* public class TreeNode {
* int val;
* int size; //stores the number of descendants including itself
* boolean isRoman; // is true when this node is roman and all its descendants are roman,
* //false even when a grand grand child node of this node is not roman.
* TreeNode left;
* TreeNode right;
* }
*/
public static void roman(TreeNode root, List<TreeNode> lst){
if(root == null)
return;
if(root.left == null && root.right == null){
root.size = 1;
root.isRoman = true;
return;
}
int left = 0;
int right = 0;
boolean isLeftRoman = false;
boolean isRightRoman = false;

if(root.left != null) {
roman(root.left,lst);
left = root.left.size;
isLeftRoman = root.left.isRoman;
}

if(root.right != null) {
roman(root.right,lst);
right = root.right.size;
isRightRoman = root.right.isRoman;
}

//if the current node is roman and all it\'s descendants are roman, update isRoman to true
if(Math.abs(left-right) <= 5 && isLeftRoman && isRightRoman)
root.isRoman = true;
else
root.isRoman = false;

//update the current node\'s size
root.size = left+right+1;

//add the node to list which is not roman but all its descendants are roman
if(Math.abs(left-right) > 5 && isLeftRoman && isRightRoman)
lst.add(root);
}

java You are building a system that will access a collection of data for some scientific application and you need to select an appropriate data structure. Here

java You are building a system that will access a collection

Solution

Get Help Now

Submit a Take Down Notice