Commit 7cd3e57e authored by homsm, committed by overleaf

Update on Overleaf.

parent a3216777
......@@ -38,7 +38,7 @@ A marked graph $G=(V,A,del)$ consists of nodes (=actors) $v\in V$, edges $a=(v_i
\item Used for regular computations, e.g. signal flow graphs
\end{compactitem}
\textbf{Implementation of Marked Graphs}
\textbf{Implementation of Marked Graphs (10-12)}
\begin{compactitem}
\item Hardware implementation as \textcolor{red}{synchronous digital circuits}: actors are combinational circuits, edges are synchronously clocked shift registers (everything is synchronous $\implies$ the \# of items in a queue remains the same)
\item Hardware implementation as \textcolor{red}{self-timed asynchronous circuit}: Actors and FIFO queues independently implemented, coordination and synchronization of firings by handshake protocol ($\rightarrow$ delay insensitive implementation of the semantics)
......@@ -48,6 +48,7 @@ A marked graph $G=(V,A,del)$ consists of nodes (=actors) $v\in V$, edges $a=(v_i
\ownsubsection{Sequence Graph (SG) (10-16)}
A sequence graph is a \textbf{hierarchy of directed graphs}.\\
A sequence graph is a dependence graph with single start and end node.
\begin{compactitem}
\item It contains two kinds of nodes: operations/hierarchy nodes
\item Each graph is acyclic and polar with start and end node (NOP)
......@@ -55,9 +56,27 @@ A sequence graph is a \textbf{hierarchy of directed graphs}.\\
\subitem module call (CALL)
\subitem branch (BR)
\subitem iteration (LOOP)
\item $V_S$ denotes operations of the algorithm
\item $E_S$ denotes the dependence relations
\end{compactitem}
\begin{center}
\includegraphics[width=0.9\columnwidth]{mod3}
\vspace{0.1cm}
\includegraphics[width=0.9\columnwidth]{modx}
\end{center}
\ownsubsection{Extended Sequence Graph}
See Section Iterative Algorithms (10-56)
\ownsubsection{Resource Graph (10-16)}
(bipartite) $G_R = (V_R,E_R)$ where $V_R=V_S \cup V_T$ and $V_T$ denotes the resource types of the architecture ($V_S$ are the operations). An edge $(v_s,v_t) \in E_R$ represents availability of resource type $v_t$ for operation $v_s$.\\
Example: four instances of a multiplier with cost=8\\
\includegraphics[width=\linewidth]{images/resource.JPG}
\begin{itemize}
\item \textcolor{red}{Cost function} for resource types $c: V_T \rightarrow \mathbb{Z}$.
\item \textcolor{red}{Execution times} $w: E_R \rightarrow \mathbb{Z}^{\geq 0}$ denote execution time of operation $v_s$ on resource type $v_t$.
\item An \textcolor{red}{allocation} $\alpha: V_T \rightarrow \mathbb{Z}^{\geq0}$ assigns to each resource type $v_t \in V_T$ the number $\alpha(v_t)$ of available resources.
\item A \textcolor{red}{binding} (which operation on which resource) is defined by $\beta: V_S \rightarrow V_T$ and $\gamma: V_S \rightarrow \mathbb{Z}^{\geq0}$. $\beta(v_s) = v_t$ and $\gamma(v_s)=r$ mean that operation $v_s \in V_S$ is implemented on the $r$-th instance of resource type $v_t \in V_T$.
\item A \textcolor{red}{schedule} $\tau: V_S \rightarrow \mathbb{Z}^{\geq0}$ determines the starting times of operations. It is feasible iff
$$\forall (v_i,v_j) \in E_S: \quad \tau(v_j)-\tau(v_i) \geq w(v_i) \defeq w(v_i,\beta(v_i))$$
\item The \textcolor{red}{latency} $L$ of a schedule is the time difference between start node $v_0$ and end node $v_n$: $$L=\tau(v_n)-\tau(v_0)$$
\end{itemize}
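\textbf{Worked example (illustrative values, not taken from the slides):} Assume a chain $v_0 \to v_1 \to v_2 \to v_n$ in $E_S$, where $v_0$ and $v_n$ are NOPs with $w(v_0)=w(v_n)=0$, $v_1$ is bound to a multiplier with $w(v_1)=2$, and $v_2$ is bound to an adder with $w(v_2)=1$. The schedule $\tau(v_0)=1$, $\tau(v_1)=1$, $\tau(v_2)=3$, $\tau(v_n)=4$ is then feasible, since
\begin{equation*}
\begin{alignedat}{3}
\tau(v_1)-\tau(v_0) &= 0 &&\;\geq w(v_0) &&= 0\\
\tau(v_2)-\tau(v_1) &= 2 &&\;\geq w(v_1) &&= 2\\
\tau(v_n)-\tau(v_2) &= 1 &&\;\geq w(v_2) &&= 1
\end{alignedat}
\end{equation*}
and its latency is $L=\tau(v_n)-\tau(v_0)=4-1=3$.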
\ No newline at end of file
......@@ -9,7 +9,8 @@ Determines a hardware architecture that efficiently executes a given algorithm.
\item \textcolor{red}{Binding}: Determine relation between individual operations and HW resources
\end{compactitem} \vspace{0.2cm}
\ownsubsection{Models (10-3)} \oldline
\ownsubsection{Models (10-3)} \\
See Architecture Models.
\begin{compactitem}
\item \textcolor{red}{Sequence Graph} $G_S = (V_S,E_S)$ where $V_S$ denotes the operations and $E_S$ the dependence relations of the algorithm
\item \textcolor{red}{Resource Graph} (bipartite) $G_R = (V_R,E_R)$ where $V_R=V_S \cup V_T$ and $V_T$ denotes the resource types of the architecture ($V_S$ are the operations). An edge $(v_s,v_t) \in E_R$ represents availability of resource type $v_t$ for operation $v_s$.
......@@ -93,10 +94,10 @@ Model all timing constraints using relative constraints:
\end{alignedat}
\end{equation*}
Represent those restrictions as a \textcolor{red}{weighted constraint graph} $G_C=(V_C,E_C,d)$ related to a sequence graph $G_S=(V_S,E_S)$ that contains nodes $V_C=V_S$ (the operations) and a weighted, directed edge for each time constraint. $d(v_i,v_j)$ means $\tau(v_j)-\tau(v_i) \geq d(v_i,v_j)$. \newline
Represent those restrictions as a \textcolor{red}{weighted constraint graph} $G_C=(V_C,E_C,d)$ related to a sequence graph $G_S=(V_S,E_S)$ that contains nodes $V_C=V_S$ (the operations) and a weighted, directed edge for each time constraint. $d(v_i,v_j)$ means $\tau(v_j)-\tau(v_i) \geq d(v_i,v_j)$. ($d(v_i,v_j)$ is the weight of the edge from $v_i$ to $v_j$) \newline
There is no valid schedule for the given timing constraints (even with unlimited resources) if the weighted graph contains a positive cycle (= the sum of all weights along the cycle is positive).
\textcolor{red}{\textbf{Bellman-Ford-Algorithm}} to find optimal solution: \newline
\textcolor{red}{\textbf{Bellman-Ford-Algorithm}} to find the optimal starting time of each task: \newline
Start at $$\tau(v_0)=1$$
Iteratively set $$\tau(v_j) \defeq \max\{\tau(v_j),\tau(v_i)+d(v_i,v_j) : (v_i,v_j) \in E_C\}$$ for all $v_j \in V_C$, starting from
$$\forall v_i \in V_C\setminus\{v_0\}: \quad \tau(v_i) = -\infty$$
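
\textbf{Worked example (illustrative weights, assumed for this sketch):} Let $V_C=\{v_0,v_1,v_2\}$ with $d(v_0,v_1)=2$, $d(v_1,v_2)=3$ and $d(v_2,v_1)=-4$ (the last edge encodes the maximum constraint $\tau(v_2)-\tau(v_1)\leq 4$). Starting from $\tau(v_0)=1$ and $\tau(v_1)=\tau(v_2)=-\infty$, the first pass gives
\begin{equation*}
\begin{alignedat}{1}
\tau(v_1) &= \max\{-\infty,\ 1+2\} = 3\\
\tau(v_2) &= \max\{-\infty,\ 3+3\} = 6
\end{alignedat}
\end{equation*}
and the second pass changes nothing, because $\tau(v_2)+d(v_2,v_1)=6-4=2 \leq 3=\tau(v_1)$. The cycle $v_1 \to v_2 \to v_1$ has weight $3-4=-1\leq 0$, so the schedule is valid. With $d(v_2,v_1)=-2$ instead, the cycle weight would be $+1$, $\tau(v_1)$ and $\tau(v_2)$ would grow in every pass, and no valid schedule would exist.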
......@@ -112,7 +113,7 @@ $$\forall v_i \in V_C\setminus\{v_0\}: \quad \tau(v_i) = -\infty$$
% \end{equation*}
% where $L_{max}$ is an upper bound on the latency.
\textcolor{red}{\textbf{List Scheduling}} (widely used heuristic)
\textcolor{red}{\textbf{List Scheduling} (10-45)} (widely used heuristic)
\begin{compactitem}
\item Each operation has a static priority
......@@ -147,7 +148,7 @@ $$\forall v_i \in V_C\setminus\{v_0\}: \quad \tau(v_i) = -\infty$$
Produces the following (indep. of priorities)\\
\includegraphics[width=0.3\linewidth]{list_scheduling}
\textcolor{red}{\textbf{Integer Linear Programming}}
\textcolor{red}{\textbf{Integer Linear Programming (10-50)}}
\begin{compactitem}
\item Yields optimal solution
\item Solves scheduling, binding and allocation simultaneously
......@@ -222,10 +223,19 @@ Iterative Algorithms consist of a set of indexed equations that are evaluated fo
\begin{center}
\includegraphics[width=0.6\columnwidth]{arch1}
\end{center}
Each edge is annotated with its index displacement.
\item \textcolor{red}{Marked graph}:
\begin{center}
\includegraphics[width=0.6\columnwidth]{arch2}
\end{center}
\item \textcolor{red}{Signal flow graph}:
\begin{center}
\includegraphics[width=0.6\columnwidth]{images/flowgraph.JPG}
\end{center}
\item \textcolor{red}{Loop program}:
\begin{center}
\includegraphics[width=0.6\columnwidth]{images/loop.JPG}
\end{center}
\end{compactitem}
($\to$ essentially a sequence graph is executed repeatedly)
......@@ -240,6 +250,16 @@ Iterative Algorithms consist of a set of indexed equations that are evaluated fo
\item In case of \textcolor{red}{loop folding}, starting and finishing times of an operation are in different physical iterations.
\end{compactitem}
\end{definition}
\textbf{Implementation}
\begin{itemize}
\item \textbf{Simple possibility:} Edges with $d_{ij}>0$ are removed from the extended sequence graph. The resulting sequence graph is implemented using standard methods.\\
\includegraphics[width=\linewidth]{images/simple.JPG}
\item \textbf{Functional pipelining:} Successive iterations overlap, so a higher throughput ($1/P$) is obtained.\\
Calculate $P$ from the timing constraints.\\
With unlimited resources:\\
\includegraphics[width=\linewidth]{images/pipe.JPG}
\end{itemize}
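\textbf{Worked example (illustrative values, assumed for this sketch):} Take a loop with two operations, an edge $(v_1,v_2)$ with $d_{12}=0$ and $w(v_1)=2$, and a loop-carried edge $(v_2,v_1)$ with $d_{21}=1$ and $w(v_2)=1$. With the relaxed precedence constraint $\tau(v_j)-\tau(v_i)\geq w(v_i)-d_{ij}\cdot P$, this gives
\begin{equation*}
\begin{alignedat}{1}
\tau(v_2)-\tau(v_1) &\geq 2\\
\tau(v_1)-\tau(v_2) &\geq 1-P
\end{alignedat}
\end{equation*}
Adding both inequalities yields $0 \geq 3-P$, i.e. $P\geq 3$, so the throughput is at most $1/3$ iterations per time unit even with unlimited resources.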
\textbf{Solving Synthesis Problem using Integer Linear Programming} \newline
\begin{compactenum}
......@@ -247,7 +267,8 @@ Iterative Algorithms consist of a set of indexed equations that are evaluated fo
\item Use extended sequence graph, including displacements $d_{ij}$ (edge weights)
\item ASAP and ALAP scheduling for upper and lower bounds $h_i,l_i$. Use only edges with $d_{ij}=0$
\item Guess a suitable iteration interval $P$. If this is not feasible, increase $P$
\item Replace equation 5 with $$\forall (v_i,v_j) \in E_S: \quad \tau(v_j)-\tau(v_i) \geq w(v_i)-d_{ij}\cdot P $$
\item Replace equation 5 with $$\forall (v_i,v_j) \in E_S: \quad \tau(v_j)-\tau(v_i) \geq w(v_i)-d_{ij}\cdot P $$\\
Proof: see slide 10-65.
\item Replace equation 6 with
\begin{equation*}
\begin{alignedat}{1}
......