16. Zheng, M., Ravi, V.T., Qin, F., Agrawal, G.: GRace: a low-overhead mechanism for detecting
data races in GPU programs. In: Proceedings of the 16th ACM Symposium on Principles and
Practice of Parallel Programming (2011)
17. Bekar, U.C., Elmas, T., Okur, S., Tasiran, S.: KUDA: GPU accelerated split race checker. In:
Workshop on Determinism and Correctness in Parallel Programming, WoDet (2012)
18. He, G., Zhai, A.: Improving the performance of program monitors with compiler support in
multi-core environment. In: IEEE International Symposium on Parallel and Distributed
Processing, IPDPS (2010)
19. Chen, S., Falsafi, B., Gibbons, P.B., Kozuch, M., Mowry, T.C., Teodorescu, R., Ailamaki, A.,
Fix, L., Ganger, G.R., Lin, B., Schlosser, S.W.: Log-based architectures for general-purpose
monitoring of deployed code. In: Proceedings of the 1st Workshop on Architectural and
System Support for Improving Software Dependability (2006)
20. Bloom, B.H.: Space/Time Trade-offs in Hash Coding with Allowable Errors.
Communications of the ACM (1970)
21. Carter, J.L., Wegman, M.N.: Universal Classes of Hash Functions. In: ACM Symposium on
Theory of Computing (1977)
22. Xu, M., Bodik, R., Hill, M.: A "flight data recorder" for enabling full-system multiprocessor
deterministic replay. In: International Symposium on Computer Architecture, ISCA (2003)
23. Prvulovic, M., Zhang, Z., Torrellas, J.: ReVive: cost-effective architectural support for
rollback recovery in shared-memory multiprocessors. In: International Symposium on Computer
Architecture, ISCA (2002)
24. NVIDIA Corporation: NVIDIA CUDA C Programming Guide,
http://www.nvidia.com
25. Xiao, S., Feng, W.C.: Inter-block GPU communication via fast barrier synchronization. In:
2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS (2010)
26. Gonzalez-Alberquilla, R., Strauss, K., Ceze, L., Pinuel, L.: Accelerating data race detection
with minimal hardware support. In: Jeannot, E., Namyst, R., Roman, J. (eds.) Euro-Par 2011,
Part I. LNCS, vol. 6852, pp. 27-38. Springer, Heidelberg (2011)
27. Sack, P., Bliss, B.E., Ma, Z., Petersen, P., Torrellas, J.: Accurate and efficient filtering for the
Intel Thread Checker race detector. In: Proceedings of the 1st Workshop on Architectural and
System Support for Improving Software Dependability (2006)
28. Magnusson, P., Christensson, M., Eskilson, J., Forsgren, D., Hallberg, G., Hogberg, J.,
Larsson, F., Moestedt, A., Werner, B.: Simics: A Full System Simulation Platform. Computer
(2002)
29. Martin, M.M.K., Sorin, D.J., Beckmann, B.M., Marty, M.R., Xu, M., Alameldeen, A.R.,
Moore, K.E., Hill, M.D., Wood, D.A.: Multifacet's General Execution-driven Multiprocessor
Simulator (GEMS) Toolset. SIGARCH Computer Architecture News (2005)
30. Bakhoda, A., Yuan, G., Fung, W., Wong, H., Aamodt, T.: Analyzing CUDA workloads using
a detailed GPU simulator. In: International Symposium on Performance Analysis of Systems
and Software, ISPASS (2009)
31. Agarwal, N., Krishna, T., Peh, L.S., Jha, N.: GARNET: A Detailed On-chip Network Model
Inside a Full-system Simulator. In: ISPASS (2009)
32. Bienia, C., Kumar, S., Singh, J.P., Li, K.: The PARSEC benchmark suite: characterization and
architectural implications. In: International Conference on Parallel Architectures and
Compilation Techniques, PACT (2008)
33. Woo, S.C., Ohara, M., Torrie, E., Singh, J.P., Gupta, A.: The SPLASH-2 programs:
characterization and methodological considerations. In: International Symposium on Computer
Architecture, ISCA (1995)