Commit 7cd3e57e by homsm Committed by overleaf

### Update on Overleaf.

parent a3216777
 ... ... @@ -38,7 +38,7 @@ A marked graph $G=(V,A,del)$ consists of nodes (=actors) $v\in V$, edges $a=(v_i \item Used for regular computations, e.g. signal flow graphs \end{compactitem} \textbf{Implementation of Marked Graphs} \textbf{Implementation of Marked Graphs (10-12)} \begin{compactitem} \item Hardware implementation as \textcolor{red}{synchronous digital circuits}: Actors are combinational circuits, edges are synchronously clocked shift registers (everything synchronous $\implies$ the \# of items in a queue remains the same) \item Hardware implementation as \textcolor{red}{self-timed asynchronous circuit}: Actors and FIFO queues independently implemented, coordination and synchronization of firings by handshake protocol ($\rightarrow$ delay-insensitive implementation of the semantics) ... ... @@ -48,6 +48,7 @@ A marked graph $G=(V,A,del)$ consists of nodes (=actors) $v\in V$, edges $a=(v_i \ownsubsection{Sequence Graph (SG) (10-16)} A sequence graph is a \textbf{hierarchy of directed graphs}.\\ A sequence graph is a dependence graph with a single start and end node. \begin{compactitem} \item It contains two kinds of nodes: operations/hierarchy nodes \item Each graph is acyclic and polar with start and end node (NOP) ... ... @@ -55,9 +56,27 @@ A sequence graph is a \textbf{hierarchy of directed graphs}.\\ \subitem module call (CALL) \subitem branch (BR) \subitem iteration (LOOP) \item $V_S$ denotes operations of the algorithm \item $E_S$ denotes the dependence relations \end{compactitem} \begin{center} \includegraphics[width=0.9\columnwidth]{mod3} \vspace{0.1cm} \includegraphics[width=0.9\columnwidth]{modx} \end{center} \ownsubsection{Extended Sequence Graph} See Section Iterative Algorithms (10-56) \ownsubsection{Resource Graph (10-16)} (bipartite) $G_R = (V_R,E_R)$ where $V_R=V_S \cup V_T$ and $V_T$ denotes the resource types of the architecture ($V_S$ are the operations). 
An edge $(v_s,v_t) \in E_R$ represents availability of resource type $v_t$ for operation $v_s$.\\ Example: four instances of a multiplier with cost=8\\ \includegraphics[width=\linewidth]{images/resource.JPG} \begin{itemize} \item \textcolor{red}{Cost function} for operations $c: V_T \rightarrow \mathbb{Z}$. \item \textcolor{red}{Execution times} $w: E_R \rightarrow \mathbb{Z}^{\geq 0}$ denote execution time of operation $v_s$ on resource type $v_t$. \item An \textcolor{red}{allocation} $\alpha: V_T \rightarrow \mathbb{Z}^{\geq0}$ assigns to each resource type $v_t \in V_T$ the number $\alpha(v_t)$ of available resources. \item A \textcolor{red}{binding} (which operation on which resource) is defined by $\beta: V_S \rightarrow V_T$ and $\gamma: V_S \rightarrow \mathbb{Z}^{\geq0}$. $\beta(v_s) = v_t$ and $\gamma(v_s)=r$ mean that operation $v_s \in V_S$ is implemented on the $r$-th instance of resource type $v_t \in V_T$. \item A \textcolor{red}{schedule} $\tau: V_S \rightarrow \mathbb{Z}^{\geq0}$ determines the starting times of operations. It is feasible iff $$\forall (v_i,v_j) \in E_S: \quad \tau(v_j)-\tau(v_i) \geq w(v_i) \defeq w(v_i,\beta(v_i))$$ \item \textcolor{red}{latency} $L$ of a schedule is the time between start node $v_0$ and end node $v_n$: $$L=\tau(v_n)-\tau(v_0)$$ \end{itemize} \ No newline at end of file
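
The feasibility condition and the latency definition above can be checked mechanically. The following is a minimal Python sketch, not part of the commit. It assumes the schedule $\tau$ and the execution times $w(v_i) = w(v_i,\beta(v_i))$ are given as plain dictionaries; the names `is_feasible`, `latency`, and the example graph are hypothetical.

```python
def is_feasible(edges, tau, w):
    """Check tau(v_j) - tau(v_i) >= w(v_i) for every dependence edge (v_i, v_j).

    edges: iterable of (v_i, v_j) pairs, i.e. E_S
    tau:   dict mapping each operation to its start time
    w:     dict mapping each operation to its execution time on the
           resource type it is bound to (assumed precomputed from beta)
    """
    return all(tau[vj] - tau[vi] >= w[vi] for vi, vj in edges)


def latency(tau, v0, vn):
    """Latency L = tau(v_n) - tau(v_0) between start node and end node."""
    return tau[vn] - tau[v0]


# Hypothetical example: v0 and vn are NOP nodes, "mul" takes 2 steps, "add" 1.
example_edges = [("v0", "mul"), ("mul", "add"), ("add", "vn")]
example_w = {"v0": 0, "mul": 2, "add": 1, "vn": 0}
example_tau = {"v0": 1, "mul": 1, "add": 3, "vn": 4}

print(is_feasible(example_edges, example_tau, example_w))
print(latency(example_tau, "v0", "vn"))
```

Starting `add` one step earlier would violate the edge from `mul` to `add`, because `mul` needs two time steps before its result is available.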
 ... ... @@ -9,7 +9,8 @@ Determines a hardware architecture that efficiently executes a given algorithm. \item \textcolor{red}{Binding}: Determine relation between individual operations and HW resources \end{compactitem} \vspace{0.2cm} \ownsubsection{Models (10-3)} \oldline \ownsubsection{Models (10-3)} \\ see Architecture Models \begin{compactitem} \item \textcolor{red}{Sequence Graph} $G_S = (V_S,E_S)$ where $V_S$ denotes the operations and $E_S$ the dependence relations of the algorithm \item \textcolor{red}{Resource Graph} (bipartite) $G_R = (V_R,E_R)$ where $V_R=V_S \cup V_T$ and $V_T$ denotes the resource types of the architecture ($V_S$ are the operations). An edge $(v_s,v_t) \in E_R$ represents availability of resource type $v_t$ for operation $v_s$. ... ... @@ -93,10 +94,10 @@ Model all timing constraints using relative constraints: \end{alignedat} \end{equation*} Represent those restrictions as a \textcolor{red}{weighted constraint graph} $G_C=(V_C,E_C,d)$ related to a sequence graph $G_S=(V_S,E_S)$ that contains nodes $V_C=V_S$ (the operations) and a weighted, directed edge for each time constraint. $d(v_i,v_j)$ means $\tau(v_j)-\tau(v_i) \geq d(v_i,v_j)$. \newline Represent those restrictions as a \textcolor{red}{weighted constraint graph} $G_C=(V_C,E_C,d)$ related to a sequence graph $G_S=(V_S,E_S)$ that contains nodes $V_C=V_S$ (the operations) and a weighted, directed edge for each time constraint. $d(v_i,v_j)$ means $\tau(v_j)-\tau(v_i) \geq d(v_i,v_j)$. ($d(v_i,v_j)$ is the weight of the arrow from $v_i$ to $v_j$) \newline There is no valid schedule for the given timing constraints (even assuming unlimited resources) if the weighted graph contains a positive cycle (=the sum of all weights in the cycle is positive). 
\textcolor{red}{\textbf{Bellman-Ford-Algorithm}} to find optimal solution: \newline \textcolor{red}{\textbf{Bellman-Ford-Algorithm}} to find the optimal solution for the starting time of each task: \newline Start at $$\tau(v_0)=1$$ Iteratively set $$\tau(v_j) \defeq \max\{\tau(v_j),\tau(v_i)+d(v_i,v_j) : (v_i,v_j) \in E_C\}$$ for all $v_j \in V_C$ starting from $$\forall v_i \in V_C\setminus\{v_0\}: \quad \tau(v_i) = -\infty$$ ... ... @@ -112,7 +113,7 @@ $$\forall v_i \in V_C\setminus\{v_0\}: \quad \tau(v_i) = -\infty$$ % \end{equation*} % where $L_{max}$ is an upper bound on the latency. \textcolor{red}{\textbf{List Scheduling}} (widely used heuristic) \textcolor{red}{\textbf{List Scheduling} (10-45)} (widely used heuristic) \begin{compactitem} \item Each operation has a static priority ... ... @@ -147,7 +148,7 @@ $$\forall v_i \in V_C\setminus\{v_0\}: \quad \tau(v_i) = -\infty$$ Produces the following (indep. of priorities)\\ \includegraphics[width=0.3\linewidth]{list_scheduling} \textcolor{red}{\textbf{Integer Linear Programming}} \textcolor{red}{\textbf{Integer Linear Programming (10-50)}} \begin{compactitem} \item Yields optimal solution \item Solves scheduling, binding and allocation simultaneously ... ... @@ -222,10 +223,19 @@ Iterative Algorithms consist of a set of indexed equations that are evaluated fo \begin{center} \includegraphics[width=0.6\columnwidth]{arch1} \end{center} To each edge there is associated the index displacement \item \textcolor{red}{Marked graph}: \begin{center} \includegraphics[width=0.6\columnwidth]{arch2} \end{center} \item \textcolor{red}{signal flow graph:} \begin{center} \includegraphics[width=0.6\columnwidth]{images/flowgraph.JPG} \end{center} \item \textcolor{red}{loop program:} \begin{center} \includegraphics[width=0.6\columnwidth]{images/loop.JPG} \end{center} \end{compactitem} ($\to$ essentially a sequence graph is executed repeatedly) ... ... 
@@ -240,6 +250,16 @@ Iterative Algorithms consist of a set of indexed equations that are evaluated fo \item In case of \textcolor{red}{loop folding}, starting and finishing times of an operation are in different physical iterations. \end{compactitem} \end{definition} \textbf{Implementation} \begin{itemize} \item \textbf{Simple possibility:} Edges with $d_{ij}>0$ are removed from the extended sequence graph. The resulting sequence graph is implemented using standard methods.\\ \includegraphics[width=\linewidth]{images/simple.JPG} \item \textbf{Functional pipelining:} Successive iterations overlap and a higher throughput ($1/P$) is obtained.\\ Calculate $P$ with the help of timing constraints.\\ With unlimited resources:\\ \includegraphics[width=\linewidth]{images/pipe.JPG} \end{itemize} \textbf{Solving Synthesis Problem using Integer Linear Programming} \newline \begin{compactenum} ... ... @@ -247,7 +267,8 @@ Iterative Algorithms consist of a set of indexed equations that are evaluated fo \item Use extended sequence graph, including displacements $d_{ij}$ (edge weights) \item ASAP and ALAP scheduling for upper and lower bounds $h_i,l_i$. Use only edges with $d_{ij}=0$ \item Guess a suitable iteration interval $P$. If this is not feasible, increase $P$ \item Replace equation 5 with $$\forall (v_i,v_j) \in E_S: \quad \tau(v_j)-\tau(v_i) \geq w(v_i)-d_{ij}\cdot P$$ \item Replace equation 5 with $$\forall (v_i,v_j) \in E_S: \quad \tau(v_j)-\tau(v_i) \geq w(v_i)-d_{ij}\cdot P$$\\ proof on slide 10-65 \item Replace equation 6 with \begin{equation*} \begin{alignedat}{1} ... ...
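
The Bellman-Ford iteration on the weighted constraint graph, and the "guess $P$, increase if infeasible" step, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the summary's own method. It assumes unlimited resources, that every node is reachable from the start node, and that edges are given as a dictionary; the helper names `schedule_bellman_ford` and `min_iteration_interval` and the bound `p_max` are hypothetical.

```python
import math


def schedule_bellman_ford(nodes, edges, source, start=1):
    """Longest-path Bellman-Ford on a constraint graph.

    nodes:  list of node names (V_C)
    edges:  dict {(v_i, v_j): d} meaning tau(v_j) - tau(v_i) >= d
    source: start node v_0, scheduled at time `start`
    Returns the earliest start times, or None if a positive cycle
    (reachable from the source) makes the constraints unsatisfiable.
    """
    tau = {v: -math.inf for v in nodes}
    tau[source] = start
    for _ in range(len(nodes) - 1):
        changed = False
        for (vi, vj), d in edges.items():
            if tau[vi] + d > tau[vj]:
                tau[vj] = tau[vi] + d
                changed = True
        if not changed:
            return tau
    # If any start time can still grow, there is a positive cycle.
    for (vi, vj), d in edges.items():
        if tau[vi] + d > tau[vj]:
            return None
    return tau


def min_iteration_interval(nodes, ext_edges, w, source, p_max=64):
    """Smallest P in 1..p_max for which the constraints
    tau(v_j) - tau(v_i) >= w(v_i) - d_ij * P are satisfiable.

    ext_edges: dict {(v_i, v_j): d_ij} of the extended sequence graph
    w:         dict of execution times w(v_i)
    """
    for p in range(1, p_max + 1):
        edges = {(vi, vj): w[vi] - dij * p for (vi, vj), dij in ext_edges.items()}
        if schedule_bellman_ford(nodes, edges, source) is not None:
            return p
    return None
```

For instance, with a hypothetical loop in which operation `a` (2 time steps) feeds `b` (1 time step) and `b` feeds `a` in the next iteration ($d_{ij}=1$), the cycle weight is $2 + 1 - P$, so the search stops at $P=3$.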

30.1 KB

images/loop.JPG 0 → 100644

39.7 KB

images/pipe.JPG 0 → 100644

111 KB

52.9 KB

images/simple.JPG 0 → 100644

70.7 KB
