
Java Interview Guide:

http://www.amazon.com/Java-Interview-Guide-Confidence-Understanding-ebook/dp/B015HF9SJQ/ref=tmm_kin_swatch_0?_encoding=UTF8&qid=1456322150&sr=8-8


Java Interview Guide: How to Build Confidence With a Solid Understanding of Core Java Principles

A technical interview can be a harrowing experience. You only have a short time to convince someone that you’re worth hiring, so you can’t afford to go to a job interview unprepared. Unfortunately, memorizing the answers to interview questions just doesn’t cut it. You need to understand the underlying concepts.


Greedy Algorithm, Minimum Spanning Tree

2.5_lecture9

Greedy Algorithm

->look for an edge that seems like a good thing to add, add it, and repeat until done

Once an edge is added to the soln, the algorithm never goes back and tries sth else

 

Prim’s algorithm

->starts w/ any vertex and builds a single tree out from there by always finding the smallest edge from the current tree to the rest of the graph (a code sketch follows this walkthrough)

->can choose any vertex to start w/

->start w/ Q. Look at all edges out of the current tree and find the smallest one: Q to JK (10)

->look at smallest edge out of either Q or JK to any other vertex in the graph

->Q to HK(35)

->look for same thing out of Q/JK/HK : HK to JR (0.5)

=>you can have only one tree growing at a time

->only vertex left: A: add A to JR
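A minimal heap-based sketch of Prim's algorithm, assuming the graph is given as a dict mapping each vertex to a list of (weight, neighbor) pairs; this representation is illustrative, not from the lecture:

```python
import heapq

def prim_mst(graph, start):
    """Prim: grow one tree from `start`, always adding the smallest
    edge from the current tree to the rest of the graph."""
    in_tree = {start}
    mst = []
    heap = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:              # edge leads back into the tree: skip
            continue
        in_tree.add(v)                # smallest edge out of the tree: add it
        mst.append((u, v, w))
        for w2, x in graph[v]:        # new candidate edges out of the tree
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return mst
```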

Kruskal’s algorithm

->sorts the edges from smallest to largest, and always adds the smallest edge as long as it connects two vertices not already connected by a path: JR to HK (0.5) (a union-find sketch follows this walkthrough)

->next smallest edge:HK to A(2)

->next smallest edge: Q to JF(10)

=>you can have multiple trees at various points in time

->it’s only at the very end of the algorithm you’ll have a single tree

-> next smallest edge: HK to A(5) : but we don’t add!

->we already have a path HK to A in our soln

-> next smallest edge: HK to Q(35): add this since there is no path btw them
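A minimal sketch of Kruskal's algorithm, using union-find to answer "is there already a path btw these two vertices?"; vertices are assumed to be numbered 0..n-1:

```python
def kruskal_mst(n, edges):
    """Kruskal: sort edges by weight; add each edge unless its endpoints
    are already connected by a path in the forest built so far.
    `edges` is a list of (weight, u, v) triples."""
    parent = list(range(n))

    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):     # smallest to largest
        ru, rv = find(u), find(v)
        if ru != rv:                  # no path btw u and v yet
            parent[ru] = rv           # merge the two trees
            mst.append((u, v, w))
    return mst
```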

 

=>same tree w/ diff algorithms!

->which one is better?

->it’s not clear that either gets the correct m.s.t

how do we prove it gets the very best spanning tree?

=>prove these algorithms work

->see how greedy algorithms work

 

how to prove greedy algorithm gets the correct soln?

->proof by contradiction!

->Claim: the algorithm never makes a 1st mistake

->suppose there is a 1st mistake; prove it was not actually a mistake, i.e. we can still build a m.s.t.

->an inductive proof turned into a proof by contradiction

 

->Prim’s algorithm never makes a 1st mistake (= a mistake in which we add an edge such that, after that point, we cannot build a m.s.t including all the edges we’ve added so far)

->contradiction; there is a 1st mistake

->we have to argue: not a mistake

->1st mistake: at some point we have a tree on i vertices (Ti)

->then we added some edge x-y btw a vertex that was in the tree and a vertex that wasn’t in the tree

=>1st mistake the algorithm makes

->up to this point, there was a m.s.t that included all the edges added so far

->there was a m.s.t and it included Ti as part of that tree and also other edges as well

->all edges of Ti were part of some m.s.t =>Tm

->but there is no m.s.t that contains Ti and x-y (b/c it’s supposed to be a mistake)

->take Tm, add x-y

->it will have a cycle

->since in m.s.t, there was already a path from x to y and we add edge x-y

->cycle C: includes x and y

->x is in Ti and y is not in Ti

->following the cycle, it eventually has to get to sth not in Ti

->some vertex u in Ti must cross over to some vertex v not in Ti

->to get from sth in Ti to sth not in Ti, we have to cross over

->call x-y the edge e we added, and u-v the crossing edge e2

->how does the cost of e compare to the cost of e2?

->Prim’s algorithm chose edge e when e2 was also an edge out of Ti, so c(e) <= c(e2)

->otherwise it could not have chosen e

->but for x-y to be a mistake, the swap would have to make things worse: c(e) > c(e2) => contradiction

->so there could be no 1st mistake

 

->take the m.s.t Tm, take out edge u-v and add x-y

->Unions…

=>removing u-v and adding x-y

->new tree: Tm - e2 + x-y

->still a spanning tree

if there was a connection btw 2 vertices in Tm

->there is also a connection btw them in the new tree (reroute around the cycle)

->so it is a spanning tree.

->However, it can’t be a minimum spanning tree: by assumption x-y was a mistake, so no m.s.t contains Ti and x-y

->Why is that a contradiction?

->its cost is cost(Tm) - c(e2) + c(e) <= cost(Tm), and a spanning tree that costs no more than a m.s.t must itself be a m.s.t => contradiction

 

->local improvement: keep a solution at any point in time and improve it step by step

 


Directed Acyclic Graphs

2.1_lecture8

The key to making algorithms run quickly on a DAG is simply finding a topological sort

->you can get the reverse of topological sort as you back up from DFS.

How to find Topological Sort?

->keep track of vertices w/ indegree 0

->1) go through the graph & create indegrees of all the vertices

->O(n+m) to find indegree of every vertex

-> simply go through all the edges and each time we have an edge from x to y, we add 1 to in-degree of y

->At any point, if we reach a step where we haven’t eliminated all the vertices and there’s no vertex w/ in-degree 0, then there must be a cycle in the graph (see the sketch after this list)

->this will continue to completion if directed graph has no cycles

->if it has cycles, the process can never eliminate any vertex on a cycle

->none of the vertices can be eliminated b/c every vertex in a cycle has in-degree at least 1
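A minimal sketch of this in-degree bookkeeping (Kahn's algorithm), assuming the graph is a dict mapping each vertex to its list of successors; the names are illustrative:

```python
from collections import deque

def topological_sort(graph):
    """Repeatedly remove in-degree-0 vertices. One O(n+m) pass over all
    edges builds the in-degrees; if we get stuck before eliminating every
    vertex, the leftover vertices all sit on cycles."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:                        # edge v->w adds 1 to w
            indegree[w] += 1
    ready = deque(v for v in graph if indegree[v] == 0)
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for w in graph[v]:                        # removing v frees successors
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)
    if len(order) < len(graph):
        raise ValueError("graph has a cycle")     # no in-degree-0 vertex left
    return order
```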

 

->this is why you can’t just list the vertices in order of in-degree

=>Critical path: longest path in a directed acyclic graph

->find longest path to each vertex in order of topological sort of the vertex

->longest path to a vertex w/ in-degree 0 = 0

->when I come to a vertex, I look at all of the edges into that vertex

->find the longest path into that vertex, how would I choose that?

->I would have an array that has cost to each vertex put in it

->continue w/ every vertex

=>longest path in the graph = largest # I would write down associated w/ any vertex

->finding the longest path: in O(n+m), we find a topological sort

->we go to each vertex and look at all the edges into that vertex

->each edge is looked at once => O(n+m) overall

->shortest path in a DAG?

->same as above: pick the shortest one instead! (in the sketch below, take min instead of max)
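A sketch of the critical-path computation, reusing the topological_sort sketch above; `weight` is assumed to map each edge (u, v) to its length:

```python
def critical_path_length(graph, weight):
    """Longest (critical) path in a DAG: visit vertices in topological
    order, keeping per vertex the longest path ending there; the answer
    is the largest number written down at any vertex."""
    longest = {v: 0 for v in graph}               # in-degree-0 vertices: 0
    for v in topological_sort(graph):
        for w in graph[v]:                        # edge v->w extends v's path
            longest[w] = max(longest[w], longest[v] + weight[(v, w)])
    return max(longest.values())
```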

 


Strong Connectivity

1.29_lecture7

Strong Connectivity: for every pair of vertices (x,y), there is a path from x to y and a path from y to x

->you can reach any vertex from any other vertex

->DFS to solve it

How to determine whether a graph is strongly connected?

->by taking any vertex x and seeing first using DFS that x has a path to every vertex

->and then reverse all those edges

->reversal of G: do a DFS on this graph =>will see if every vertex has a path to x

->If the graph is strongly connected, it is true that x has a path to every vertex and every vertex has a path to x

->with these properties, you can get btw any y and z; b/c we know y has a path to x and x has a path to z

->linear time, just by knowing DFS (sketch below)
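A sketch of the two-pass check, assuming a dict from vertex to successor list; reaches_all is a hypothetical helper name:

```python
def is_strongly_connected(graph):
    """Two DFS passes: does some vertex x reach everything in G, and does
    everything reach x (= does x reach everything in the reversal of G)?"""
    def reaches_all(g, start):                    # plain stack-based DFS
        seen, stack = {start}, [start]
        while stack:
            for w in g[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == len(g)

    x = next(iter(graph))                         # any vertex works
    reverse = {v: [] for v in graph}
    for v in graph:                               # reverse every edge: O(n+m)
        for w in graph[v]:
            reverse[w].append(v)
    return reaches_all(graph, x) and reaches_all(reverse, x)
```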

 

Dividing into strongly connected component

->strongly connected component: maximal, i.e. we can’t add anything to it and still have a strongly connected subgraph

->connected component: part of a graph that is connected, but if you add anything else it would no longer be connected -> the same maximality holds for biconnected/strongly connected components

-> in O(n+m) using DFS? (similar to biconnected components)

->Cross edge!?

->does this help us get back or not?

->Maybe

->example where it helps/ doesn’t help

 

->back(i) = min{back(children), DFS#(backedges), sometimes DFS#(crossedge)}

->how do we know which cross edge helps?
=>rule’s a little different: when we’re breaking off a strongly connected component, we’re only breaking off at the exact vertex

->we do not include the parent if it’s an strongly connected component

->rule: if back(i) is smaller than DFS#(i), we do not break off a component at this point

=>if back(i) >= DFS#(i) => break off component!

 

->if not smaller, we break off the strongly connected component, which is the part of the tree rooted at that vertex (f in the lecture’s example)

=>f-g-h-i: one strongly connected component

 

=>back(i) = min{back(children), DFS#(back edges), DFS#(cross edges) if not already broken off}

 

->the left one (in the example) has not yet been broken off -> that portion of the graph can somehow reach back to one of its ancestors; you can get down to vertex d

=>O(n+m) for finding the break-off vertices and dividing into strongly connected components (a recursive sketch follows)

->why?

->each time I back up from a vertex, I simply look at all of the edges out of that vertex

->if I’m looking at all edges once, that’s proportional to # of edges in the graph
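The back(i) rule above is essentially Tarjan's s.c.c. algorithm. A minimal recursive sketch, assuming a dict from vertex to successor list (vertices already popped off the stack are the ones "already broken off"):

```python
import sys

def strongly_connected_components(graph):
    """One DFS in O(n+m). back[v] = min DFS# reachable from v's subtree,
    counting cross edges only to vertices not already broken off (still on
    the stack). When back[v] >= DFS#(v) on backup (equivalently ==, since
    back[v] <= DFS#(v) always), break off v's subtree as one component."""
    sys.setrecursionlimit(max(1000, 2 * len(graph) + 10))
    dfs_num, back, on_stack = {}, {}, set()
    stack, components, counter = [], [], [0]

    def dfs(v):
        dfs_num[v] = back[v] = counter[0]; counter[0] += 1
        stack.append(v); on_stack.add(v)
        for w in graph[v]:
            if w not in dfs_num:                  # tree edge
                dfs(w)
                back[v] = min(back[v], back[w])   # back(children)
            elif w in on_stack:                   # back/cross edge that helps
                back[v] = min(back[v], dfs_num[w])
        if back[v] == dfs_num[v]:                 # break off a component here
            comp = []
            while True:
                u = stack.pop(); on_stack.discard(u)
                comp.append(u)
                if u == v:
                    break
            components.append(comp)

    for v in graph:
        if v not in dfs_num:
            dfs(v)
    return components
```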

 

->Strongly connected component graph

->take each strongly connected component of the graph and turn it into a single node of s.c.c. graph

->every v is in exactly one s.c.c.

->edge btw c1 and c2 if there’s any v in c1 that has an edge to any v in c2.

->can build in O(n+m) time once you have broken into s.c.components

=>no cycles! ***

->if there were a cycle in the s.c.c. graph, anything in c1 could get to c2 and back: all of the vertices in all the components on the cycle would be able to get to all of the other vertices

->then those weren’t maximal s.c.c.s!

->contradiction: we can’t add sth to a s.c.c. and still have a s.c.c.

 

->directed graph w/o cycles

->scheduling problem should not have cycles

->how to check if there are cycles in a graph in O(n+m) time?

->if there is some s.c.c. w/ more than one v, then there’s a cycle within that s.c.c

->if every v is in its own s.c.c, then there can be no cycles, b/c the s.c.c graph has no cycles

 

->critical path: longest path


Biconnectivity

1.27_lecture6

Articulation Point: a vertex whose removal disconnects the graph

Biconnected: removing any single vertex cannot disconnect the graph = no AP!

 

->Finding the longest path that doesn’t repeat any vertices

->It’s difficult to find the longest in a general graph.

->If your graph is not biconnected, it’s easier to find the longest path btw vertices

->You’ll have to go through all the A.P. on the way

->instead of being exponential on the whole graph, you’re exponential on each of the biconnected pieces

->enormous savings if you have an exponential algorithm: divide the graph into pieces and be exponential only on the individual pieces

 

->if you shrink biconnected components into single vertices, it turns the graph you’re dealing w/ into a tree

->we identified all the A.P and broke the graph into biconnected components, all while doing DFS

->key point:

->no cross edges: you don’t have to worry about edges btw children

->when you back up from children, do calculations on them before you back up from the parent.

->Running time: O(n+m) for finding A.P and disconnecting the graph into biconnected components.

->simply doing DFS. When I back up from a vertex, my calculation just corresponds to looking at all of the edges out of that vertex. If I’m looking at all of the edges once, that’s proportional to the # of edges in the graph

->why not cover BFS?: its uses are obvious while DFS’s are not

->BFS: e.g. how closely ppl on fb are connected

->DFS on Directed Graphs

->Edge types btw vertices of the graph & how they relate to the tree

->Undirected graphs: tree edges, back edges btw ancestor-descendant, cross edges btw vertices that are neither ancestor nor descendant of each other

->Directed:

->tree edges

->cross edges

->forward edges: along the direction of the tree

->back edges: opposite direction of the tree

 

W/ respect to DFS

DFS on an undirected graph: we did not have cross edges

What types of edges can we have w/ respect to DFS on a directed graph:

->tree edges

->forward edges: I go a to b, b to c, then back up all the way; edge a-c would not be in the tree, but it would be a forward edge

->back edge: now I go down a to d, d to e, and e has an edge to a; I wouldn’t put it in the tree b/c a has already been visited

->cross edges : why not from c to d?

->if c is the one I visited first, everything visited until I back up from c is a descendant of c

->If c had a smaller DFS# and you had an edge from c to d, it would not be a cross edge.

->You would make d a child of c instead.

->cross edges only go from HIGHER DFS# => LOWER DFS# (a classifier sketch follows)
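A sketch that classifies directed edges w/ respect to a DFS using DFS#s and a "still on the current path" test; the function name and representation are illustrative:

```python
def classify_edges(graph):
    """Classify every directed edge w/ respect to a DFS of `graph`
    (dict: vertex -> successor list). tree = leads to a new vertex;
    back = to an ancestor still on the current path; forward = to an
    already-finished descendant; cross = to an already-finished vertex
    in another subtree (always higher DFS# -> lower DFS#)."""
    dfs_num, finished, kinds, counter = {}, set(), {}, [0]

    def dfs(v):
        dfs_num[v] = counter[0]; counter[0] += 1
        for w in graph[v]:
            if w not in dfs_num:
                kinds[(v, w)] = 'tree'
                dfs(w)
            elif w not in finished:               # w is on the current path
                kinds[(v, w)] = 'back'
            elif dfs_num[v] < dfs_num[w]:         # finished descendant of v
                kinds[(v, w)] = 'forward'
            else:                                 # higher DFS# -> lower DFS#
                kinds[(v, w)] = 'cross'
        finished.add(v)

    for v in graph:
        if v not in dfs_num:
            dfs(v)
    return kinds
```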

 

-> one way you could define connectivity in undirected graphs:

->you can’t divide it into 2 pieces so that there are no edges btw the two pieces

->weak connectivity:

->just throw away the directions on the edges and check whether the ‘un’directed graph is connected

 

-connectivity?

->“there is a path btw every pair of vertices”? (ambiguous for directed graphs, hence:)

Strong connectivity: for every pair of vertices x and y, there’s both a path from x to y and y to x

 


DFS

1.25_lecture5

How to implement DFS?

->fundamental operation in DFS: take an edge going out of that vertex

->Adjacency List

->can find an edge going out of a vertex in constant time (AM: O(n))

 

DFS Tree

-DFS #: order in which vertices are encountered during DFS

->only include the edges that get us to new vertices in the DFS tree

-DFS using AL

->second pointer that corresponds to where we are in the current A.L.

->each entry of the AL is looked at once

->if you do DFS, you really need this extra pointer for every vertex to show where in the A.L you currently are

->these also have to be organized in an array

->you wanna say “now I’m going to 2’s AL”; I don’t wanna have to step through the whole list to figure out where I am

->Array of lists -> in constant time, we can go to next thing that hasn’t been looked at on 2’s A.L

->What if this graph was not connected?

-> DFS Forest: set of trees

->go through the vertices until you come to one which has not been visited by DFS, and restart DFS from that vertex

->Running time of doing DFS on a graph: O(n+m)

->we are eventually going to look at every entry of every AL ONCE

->total size of the ALs = sum of degrees = twice the number of edges = 2m

->why add n?: a graph with no edges at all still requires a full DFS over all n vertices (see the sketch below)
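A sketch of DFS over an array of adjacency lists, keeping the "extra pointer" (here an iterator) per vertex so each list entry is examined exactly once, and restarting to build the DFS forest; vertices are assumed to be numbered 0..n-1:

```python
def dfs_forest(adj):
    """DFS over an array of adjacency lists in O(n+m). pointer[v] is the
    'extra pointer' into v's A.L., so each list entry is looked at once;
    restarting at unvisited vertices yields the DFS forest."""
    n = len(adj)
    dfs_number = [None] * n
    parent = [None] * n
    pointer = [iter(adj[v]) for v in range(n)]    # where we are in each A.L.
    counter = 0
    for root in range(n):                         # restart => DFS forest
        if dfs_number[root] is not None:
            continue
        dfs_number[root] = counter; counter += 1
        stack = [root]
        while stack:
            v = stack[-1]
            w = next(pointer[v], None)            # next unexamined A.L. entry
            if w is None:
                stack.pop()                       # list exhausted: back up
            elif dfs_number[w] is None:           # tree edge to a new vertex
                dfs_number[w] = counter; counter += 1
                parent[w] = v
                stack.append(w)
    return dfs_number, parent
```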

->why use a tree instead of a stack?

->trees have a special property that enables certain algorithms to run faster.

->ex) BFS

->there are edges that are in the graph but not in the tree

->Back edge (in undirected graph): sth that goes btw ancestor-descendant; you don’t count it if it’s a tree edge

->important property of BFS tree: graph has no back edges w/ respect to its BFS tree!

->there are other edges that may be in the graph but not in the tree.

->tree edges: edges that are in the tree

->back edge: ancestor-descendant but not parent-child

->in BFS, there can be no edges of the graph which are back edges w/ respect to the BFS tree

->cross edge: not ancestors-descendants of each other

->BFS: you can have cross edges!

->DFS: it can have back edges but NOT cross edges ****

->WHY?

->if it has a cross edge, a path from root would have to split off in 2 diff directions.

->Suppose there was a cross edge from Y to Z.

->if you didn’t split off in 2 directions, then one of them would have to be ancestor of another

->suppose you do split off on 2 directions,

->suppose you hit Y first; since there was an edge from Y to Z, before you backed up from Y you would look at that edge, and Z would not have been visited yet

->you would make Z a child instead of backing up from Y

->everything visited until you back up from Y is going to be a descendant of Y

->w/ respect to DFS tree, there are no cross edges!

 

 

->Find all x such that G-x is disconnected

->if that node goes down, we lose communication btw some pairs of nodes

=> “Articulation Points”

->how can it possibly divide the network?

->how far back can we reach?

=> back(v): furthest back I could reach from subtree using a single edge

=> min. DFS# reachable via a single edge from sub-tree rooted at v

->can I reach from here beyond vertex 2? otherwise removing 2 cuts this subtree off from the rest

->back(v) = min(back(children of v), DFS# of the other endpoint of any edge out of v)

->DFS using O(m+n) time

->EXCEPTION: the root is an A.P iff it has at least 2 children in the DFS tree

->every other vertex v is an A.P iff one of its children c has back(c) >= DFS#(v), i.e. the child marks it as an A.P (sketch below)
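A minimal sketch of the back(v) computation and both A.P. rules, assuming a connected undirected graph given as a dict of neighbor lists:

```python
import sys

def articulation_points(adj, root):
    """One DFS: back(v) = min DFS# reachable from v's subtree via a single
    (back) edge. A non-root v is an A.P. iff some child c has
    back(c) >= DFS#(v); the root is an A.P. iff it has >= 2 DFS children."""
    sys.setrecursionlimit(max(1000, 2 * len(adj) + 10))
    dfs_num, aps, counter = {}, set(), [0]

    def dfs(v, parent):
        dfs_num[v] = counter[0]; counter[0] += 1
        back, children = dfs_num[v], 0
        for w in adj[v]:
            if w not in dfs_num:                  # tree edge
                children += 1
                back_w = dfs(w, v)
                if parent is not None and back_w >= dfs_num[v]:
                    aps.add(v)                    # child w marks v as an A.P.
                back = min(back, back_w)
            elif w != parent:                     # back edge
                back = min(back, dfs_num[w])
        if parent is None and children >= 2:      # root exception
            aps.add(v)
        return back

    dfs(root, None)
    return aps
```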

 

 


Graphs

1.20_lecture4

Definitions:

-A dot: vertex/node/point

-Line (capturing connections btw objects): edge/link/line/arc

->Default assumption: graph is undirected!, loopless!, no multiple edges!

-Path: sequence of edges/vertices

Simple path: no repeated vertices! (except possibly the 1st and the last vertex)

-Degree of Vertex

->common running time:

->sum of degrees of all the vertices

->how big a graph is?

->we say running time of many algorithms is proportional to size of the input

->what is the size of input?

->In a graph, as opposed to when you have a bunch of #s, some running times depend on # of edges

->some algorithms, on the other hand, never look at the edges: they loop multiple times through the vertices

->which graph is bigger?

->one has more edges, the other has more vertices

->depends on how your algorithm does its work

->n = # of vertices, m = # of edges

->sum of degrees = twice the # of edges

->when you have a loop, what is D.O.V? => count as 2!

->why is the sum of degrees = twice the # of edges?

->reason: you can add edges one by one to the graph. An edge btw x and y adds 1 to the degree of x and 1 to the degree of y, so each edge adds 2 to the total degree.

->the reason you count a loop as 2 is that this identity stays true even if you have loops in the graph.

->representations of graph in a computer

->1) Adjacency Matrix: n vertices-> n by n matrix / 1 if there is an edge, 0 if there isn’t

->2) Adjacency Lists: array of lists / vertices are #ed from 1 to n /default assumption: list is not necessarily ordered! -> easier to add, remove edge / store as linked lists

-> Which is better by more than a constant factor?

-> 1) Is X adjacent to Y?

-AM: O(1)

-AL: O(n)

-> 2) Find ‘Neighbors’ of a vertex X

-AM: O(n)

-AL: O(1)

->which type of operations are you using? (toy comparison below)
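A toy comparison of the two representations on a hypothetical 5-vertex graph, showing which query each one answers cheaply:

```python
n = 5
edges = [(0, 1), (0, 3), (2, 4)]                  # hypothetical graph

matrix = [[0] * n for _ in range(n)]              # adjacency matrix
lists = [[] for _ in range(n)]                    # adjacency lists
for x, y in edges:                                # undirected: store twice
    matrix[x][y] = matrix[y][x] = 1
    lists[x].append(y)
    lists[y].append(x)

# 1) "Is x adjacent to y?"  AM: O(1) lookup / AL: O(n) scan of x's list
print(matrix[0][3] == 1, 3 in lists[0])

# 2) "Neighbors of x?"      AL: O(1) per neighbor / AM: O(n) scan of row x
print(lists[0], [y for y in range(n) if matrix[0][y]])
```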

-> What is the max # of edges you can have in a graph, not allowing multiple edges?

->you can have n(n-1)/2 edges

->AM is actually better on space.

->AM: n^2 bits of space

->AL: 2m entries (the sum of degrees), each a vertex # of up to log n bits

->AL can use more space depending on how many edges are in the graph.

 

->Algorithms for searching a graph: DFS/BFS

-> DFS: Adjacency Lists


Order Notation

1.15_lecture3

 

What people try to do is something that is in fact trying to do all of this at once

Here’s an algorithm. What is its worst-case running time?

->people say: here’s the worst case, and the time it takes on this case is this amount of time

->what’s wrong with this argument?

->it’s hard to prove that sth is the worst case!

->and you don’t need to show that sth is the worst case

->ex) to show the running time of sth is f(n), you don’t need the worst case: if you wanna show it can be as bad as f(n), you give me a case on which it takes f(n) time. Then I know that its running time is at least a constant times f(n). Proving sth is the worst case is always very hard. If you declare sth to be the worst case, I’ll challenge you as to why it’s the worst case

->ex) show me that the worst case running time of bubble sort is n^2

->NO: here’s the worst case -> why on earth should I believe it’s the worst case? -> no reason to believe this

->as long as you place 1 as the last element, bubble sort will still have to run through a full number of passes

->WHY?: every pass can move the element ‘1’ down at most one position. To finish, bubble sort will have to move ‘1’ all the way to the beginning. Any input in which 1 comes last is going to be as bad for bubble sort as this example.

->NEVER PROVE a bound by declaring a case to be the worst

->why? : 1) hard to prove it’s the worst 2) you don’t have to prove a case is the worst case

->Here’s how you do it instead:

->2 different arguments:

1) upper bound, O(f(n)): from the structure of the algorithm

->you’re not looking at any particular example

->no matter what the algorithm does, it can’t take more than a constant times f(n) because of the structure of the algorithm

->How would you do this for bubble sort (argue about the structure of the algorithm) ?

->you would make an argument that bubble sort makes at most n passes through the data

->bubble sort as an algorithm is some sort of loop: for i=1 to n, run one pass =>at most n passes of bubble sort =>each pass: O(n)

=>conclusion: bubble sort is O(n^2)

->it can never take more than a constant times n^2

->NOT DONE YET!: need to show there are inputs which take n^2 time

->in this case: generalized example

->NO: “bubble sort can take n^2 time b/c if you give me input 4,3,2,1 you can count the number of operations” is not an argument

->even if you could show bubble sort takes 16 operations/comparisons on a list of size 4, that doesn’t mean it’s n^2 ->16 could also correspond to 4 times n or 8 times log n

->to convince me that for ALL values of n there is an example which takes a constant times n^2 time, you can’t just give one example

->on the other hand, a generalized example: for every value of n, I would have an example for bubble sort

->you construct an example where bubble sort must run through its entire set of passes

->n-1 comparisons for the 1st pass, n-2 for the 2nd pass, …, 3, 2, 1

->would work for any example where 1 is the smallest element put last in your list

->the arithmetic: (n-1)+(n-2)+…+1 = n(n-1)/2, a constant times n^2

=>these two together show me that the worst case running time looks like a constant times n^2

->notice that I never argued this was the worst case.

->I simply said that on this case it will take at least a constant times n^2. There may be cases which take more time, but they can’t take more than a constant times n^2, and we know that from the structure of bubble sort (code sketch below).
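A minimal sketch of both halves of the argument in code: the double loop shows the structural O(n^2) upper bound, and worst_case(n) is the generalized family of examples (smallest element placed last) giving a matching lower bound. Names are illustrative:

```python
def bubble_sort(a):
    """At most n passes, each pass O(n): O(n^2) from the structure alone."""
    n = len(a)
    for i in range(n):                            # at most n passes
        for j in range(n - 1 - i):                # pass i: n-1-i comparisons
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

def worst_case(n):
    """Generalized example for every n: the smallest element placed last.
    Each pass moves it down at most one position, so ~n full passes are
    forced: at least a constant times n^2 work on these inputs."""
    return list(range(2, n + 1)) + [1]            # n=4 gives [2, 3, 4, 1]
```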

 

Within algorithm, we have several parts:

->Design

->Analysis: given an algorithm, what is the (mostly worst-case) running time?

->Complexity: how hard a problem is (a property of the problem, not of a particular algorithm)

->running time of the best possible algorithm to solve the problem

->measured in terms of O(f(n))

->how do we know the best algorithm for sorting is O(n log n)?

->you give an algorithm that solves in O(f(n)) time

->selection sort solves the sorting problem in O(n^2) time

->but that’s not a lower bound: for all we know there might be another, faster algorithm

->merge sort runs in n log n time, and we would argue that the best algorithm for sorting is O(n log n)

->how to show a lower bound: show that the best algorithm for sorting takes at least a constant times sth

->we don’t know how to do that!

->we can only say an algorithm is the best algorithm to solve a problem for very trivial reasons

->there are many problems where we can say the running time of an algorithm must be at least the size of the input (or the output, when output > input)

->sort/find largest: you have to look at every #

->output>input: give all paths from s to t / you gotta write all the output down

->outside that, we don’t know how to prove lower bounds on the complexity of a problem
