20 Cards in this Set

  • Front
  • Back

Test disk I/O speed

Command to test disk I/O speed: hdparm -t


Speed should be 70 MB/sec or more. Anything less is an indication of a problem.
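
A minimal sketch of the check, assuming the data disk is /dev/sda (substitute each of your own data devices; actual throughput varies by hardware):

    # Time buffered (non-cached) sequential reads from the device; run as root
    hdparm -t /dev/sda
    # Repeat for every data disk; sustained rates well under 70 MB/sec suggest a problem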

OS parameters

Set vm.swappiness to 0 in /etc/sysctl.conf


Use an ext3/ext4 filesystem; ext4 is recommended


Increase the open-file ulimit (nofile) for the mapred and hdfs users to at least 32k, 64k recommended (/etc/security/limits.conf)


Disable IPv6


Disable SELinux


Install and configure the NTP daemon
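
A hedged sketch of the matching entries; the swappiness value, user names, and 64k limit follow the card, the exact lines are only illustrative:

    # /etc/sysctl.conf -- keep Hadoop daemons from being swapped out
    vm.swappiness = 0

    # /etc/security/limits.conf -- raise the open-file limit for the service users
    hdfs    -    nofile    65536
    mapred  -    nofile    65536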


Unix user accounts

The HDFS, MapReduce, and YARN services are usually run as separate users, named hdfs, mapred, and yarn, respectively.


All three users belong to the same hadoop group.
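
A hedged sketch of creating those accounts on a node; the group and user names come from the card, the useradd flags are ordinary Linux options:

    # Run as root: shared group plus one account per service
    groupadd hadoop
    useradd -g hadoop -m hdfs
    useradd -g hadoop -m mapred
    useradd -g hadoop -m yarn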

Formatting HDFS filesystem

The formatting process creates an empty filesystem by creating the storage directories and the initial versions of the namenode’s persistent data structures.


Datanodes are not involved in the initial formatting process, since the namenode manages all of the filesystem’s metadata, and datanodes can join or leave the cluster dynamically.
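
The formatting step itself is a single command, run once as the hdfs user before the cluster is first started:

    # Creates the empty storage directories and the namenode's initial metadata;
    # never rerun on a cluster that already holds data
    hdfs namenode -format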

start-dfs.sh

As hdfs user


Starts a namenode on each machine returned by executing hdfs getconf -namenodes


Starts a datanode on each machine listed in the slaves file


Starts a secondary namenode on each machine returned by executing hdfs getconf -secondarynamenodes
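
A hedged sketch of checking and running this by hand; hdfs getconf -namenodes is named above, the rest is an ordinary invocation:

    # As the hdfs user: list the hosts the script will start namenodes on
    hdfs getconf -namenodes
    # Then launch the HDFS daemons across the cluster
    start-dfs.sh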

start-yarn.sh

As yarn user




Starts a resource manager on the local machine




Starts a node manager on each machine listed in the slaves file
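
The YARN counterpart, normally run as the yarn user on the machine that should host the resource manager:

    # Start the resource manager locally and node managers on the slaves
    start-yarn.sh
    # jps (a standard JDK tool) confirms which daemons are running
    jps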

Environment variables



HADOOP_CLASSPATH hadoop-env.sh


HADOOP_HEAPSIZE hadoop-env.sh


JAVA_HOME hadoop-env.sh


HADOOP_NAMENODE_OPTS hadoop-env.sh


HADOOP_LOG_DIR hadoop-env.sh


HADOOP_IDENT_STRING hadoop-env.sh


HADOOP_SSH_OPTS hadoop-env.sh


YARN_RESOURCEMANAGER_HEAPSIZE yarn-env.sh
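
A hedged hadoop-env.sh fragment using a few of these variables; only the variable names come from the card, every value and path is a placeholder:

    # hadoop-env.sh (illustrative values)
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk     # JDK location: placeholder path
    export HADOOP_HEAPSIZE=1000                      # daemon heap size, in MB
    export HADOOP_LOG_DIR=/var/log/hadoop            # where daemon logs are written
    export HADOOP_NAMENODE_OPTS="-Xmx2g"             # extra JVM options for the namenode only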



Important HDFS daemon properties

fs.defaultFS core-site.xml


dfs.namenode.name.dir hdfs-site.xml


dfs.datanode.data.dir hdfs-site.xml


dfs.namenode.checkpoint.dir hdfs-site.xml
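
A hedged sketch of the first three (dfs.namenode.checkpoint.dir follows the same pattern for the secondary namenode); the property names are from the card, hostnames and directories are placeholders:

    <!-- core-site.xml -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://namenode-host:8020</value>          <!-- placeholder hostname -->
    </property>

    <!-- hdfs-site.xml -->
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/data/1/dfs/nn,/data/2/dfs/nn</value>      <!-- namenode metadata, ideally on separate disks -->
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/data/1/dfs/dn,/data/2/dfs/dn</value>      <!-- datanode block storage -->
    </property>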



Important YARN daemon properties

yarn-site.xml


yarn.resourcemanager.hostname


yarn.resourcemanager.address (${yarn.resourcemanager.hostname}:8032)


yarn.nodemanager.local-dirs


yarn.nodemanager.aux-services


yarn.nodemanager.resource.memory-mb (8192)


yarn.nodemanager.resource.cpu-vcores (8)


yarn.nodemanager.vmem-pmem-ratio (2.1)
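
A hedged yarn-site.xml fragment; the property names and the 8192 MB figure follow the card, while the hostname, local dirs, and the mapreduce_shuffle aux-service value are illustrative:

    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>resourcemanager-host</value>                     <!-- placeholder hostname -->
    </property>
    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/data/1/yarn/local,/data/2/yarn/local</value>    <!-- intermediate data, spread across disks -->
    </property>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>                        <!-- required for the MapReduce shuffle -->
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>8192</value>                                     <!-- memory available to containers on this node -->
    </property>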

MapReduce job memory/CPU properties

mapreduce.map.memory.mb (1024)


mapreduce.reduce.memory.mb (1024)


mapred.child.java.opts (-Xmx200m)


mapreduce.map.java.opts (-Xmx200m)


mapreduce.reduce.java.opts (-Xmx200m)




mapreduce.map.cpu.vcores (1)


mapreduce.reduce.cpu.vcores (1)
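
A hedged per-job override, assuming the driver uses ToolRunner so the generic -D options are honored; my-job.jar and MyDriver are placeholder names, and the -Xmx heaps are deliberately kept below the container sizes:

    hadoop jar my-job.jar MyDriver \
      -D mapreduce.map.memory.mb=2048 \
      -D mapreduce.map.java.opts=-Xmx1638m \
      -D mapreduce.reduce.memory.mb=4096 \
      -D mapreduce.reduce.java.opts=-Xmx3276m \
      input output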

Default RPC Ports

Namenode 8020


Datanode 50020


Job History 10020


Resource Manager 8032


Resource Manager Admin 8033


Resource Manager Scheduler 8030


Resource Manager resource tracker 8031


Node Manager 0 (an ephemeral port is chosen at startup)


Node Manager localizer 8040

Default HTTP ports

Namenode 50070


Secondary Namenode 50090


Datanode 50075


Job History 19888


MapReduce shuffle 13562


Resource Manager 8088


Node Manager 8042

Cluster membership

In hdfs-site.xml for datanodes


dfs.hosts (include filename)


dfs.hosts.exclude (exclude filename)




In yarn-site.xml for node managers


yarn.resourcemanager.nodes.include-path


yarn.resourcemanager.nodes.exclude-path
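
A hedged sketch of wiring these up; the property names are from the card, the include/exclude file paths are placeholders:

    <!-- hdfs-site.xml -->
    <property>
      <name>dfs.hosts</name>
      <value>/etc/hadoop/conf/dfs.include</value>      <!-- allowed datanodes, one hostname per line -->
    </property>
    <property>
      <name>dfs.hosts.exclude</name>
      <value>/etc/hadoop/conf/dfs.exclude</value>      <!-- datanodes to decommission -->
    </property>

    <!-- yarn-site.xml -->
    <property>
      <name>yarn.resourcemanager.nodes.include-path</name>
      <value>/etc/hadoop/conf/yarn.include</value>
    </property>
    <property>
      <name>yarn.resourcemanager.nodes.exclude-path</name>
      <value>/etc/hadoop/conf/yarn.exclude</value>
    </property>

After editing the include/exclude files, hdfs dfsadmin -refreshNodes and yarn rmadmin -refreshNodes make the running daemons reread them.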



I/O buffer size

Default 4 KB


Recommended 128 KB




Set the property in bytes in core-site.xml


io.file.buffer.size
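
A hedged core-site.xml entry; 131072 bytes is the 128 KB recommended above:

    <property>
      <name>io.file.buffer.size</name>
      <value>131072</value>   <!-- 128 KB read/write buffer -->
    </property>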



Block Size

Default 128 MB




Set the property in bytes in hdfs-site.xml


dfs.blocksize
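
A hedged hdfs-site.xml entry; 268435456 bytes (256 MB) is shown only as an example of raising the 128 MB default:

    <property>
      <name>dfs.blocksize</name>
      <value>268435456</value>   <!-- 256 MB blocks; default is 128 MB -->
    </property>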

Datanode reserve storage space

Set the property in bytes in hdfs-site.xml


dfs.datanode.du.reserved
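
A hedged hdfs-site.xml entry; the 10 GB figure is purely illustrative:

    <property>
      <name>dfs.datanode.du.reserved</name>
      <value>10737418240</value>   <!-- reserve 10 GB per volume for non-HDFS use -->
    </property>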

Trash

The Hadoop filesystem has a trash facility


fs.trash.interval, in minutes


Default value 0 (trash disabled)


User-level feature: a .Trash folder for every user


hadoop fs -expunge permanently deletes trash contents older than the retention threshold
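
A hedged core-site.xml entry enabling trash; 1440 minutes (one day) is an illustrative retention period:

    <property>
      <name>fs.trash.interval</name>
      <value>1440</value>   <!-- deleted files stay in .Trash for one day before being purged -->
    </property>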

Reduce Slow start

Default 0.05 (5%)

Setting mapreduce.job.reduce.slowstart.completedmaps to a higher value, such as 0.80 (80%), will help improve throughput.
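
A hedged mapred-site.xml entry matching the 0.80 example above:

    <property>
      <name>mapreduce.job.reduce.slowstart.completedmaps</name>
      <value>0.80</value>   <!-- start reducers only after 80% of map tasks have completed -->
    </property>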

Short circuit local read

Enable short-circuit local reads by setting dfs.client.read.shortcircuit to true.




The path is set using the property dfs.domain.socket.path, and must be a path that only the datanode user (typically hdfs) or root can create, such as /var/run/hadoop-hdfs/dn_socket.
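
A hedged hdfs-site.xml sketch; both property names and the socket path are taken from the card:

    <property>
      <name>dfs.client.read.shortcircuit</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.domain.socket.path</name>
      <value>/var/run/hadoop-hdfs/dn_socket</value>   <!-- only hdfs (the datanode user) or root may create this path -->
    </property>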

Configuration Precedence

Highest to Lowest




Code


CLI


Client


Slave


Cluster


Default




If a value in a configuration file is marked final, it overrides all others.
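
A hedged illustration of the final flag; the property and value are only examples:

    <!-- core-site.xml: per the note above, a value marked final wins over the other levels -->
    <property>
      <name>io.file.buffer.size</name>
      <value>131072</value>
      <final>true</final>
    </property>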